Spice86 is a tool to execute, reverse engineer and rewrite real mode DOS programs for which source code is not available.
Releases are available on NuGet.
Pre-releases are also available on the Releases page.
NOTE: This is a port and a continuation of the original Java Spice86.
It requires .NET 8 and runs on Windows, macOS, and Linux.
Rewriting a program from only the binary is a hard task.
Spice86 is a tool that helps you do so with a methodical divide-and-conquer approach.
General process:
- You start by emulating the program in the Spice86 emulator.
- At the end of each run, the emulator dumps some runtime data (memory dump and execution flow).
- You load this data into Ghidra via the spice86-ghidra-plugin.
- The plugin converts the assembly instructions in the memory dump to C# that can be loaded into Spice86 and used either partially or completely instead of the assembly code.
- This allows you to gradually reimplement the assembly code with your C# methods.
- This is helpful because:
  - Small sequences of assembly can be statically analyzed and are generally easy to translate to a higher level language.
  - You work all the time with a fully working version of the program so it is relatively easy to catch mistakes early.
  - Rewriting code function by function allows you to discover the intent of the author.
This is a .NET program; you run it from the regular command line or with dotnet run. Example running a program called file.exe (the path is passed with -e or --Exe):
COM files and BIOS files are also supported.
It is recommended to set the SPICE86_DUMPS_FOLDER environment variable to point to where the emulator should dump the runtime data.
If the variable is set or if the --RecordedDataDirectory parameter is passed, the emulator will dump information about the run there. If nothing is set, data will be dumped in the current directory.
If there is already data there, the emulator will load it first and complete it, so you don’t need to start from zero each time!
--Debug (Default: false) Starts the program paused.
--Ems (Default: false) Enables EMS memory. EMS adds 8 MB of memory accessible to DOS programs through the EMM Page Frame.
--A20Gate (Default: false) Disables the 20th address line to support programs relying on the rollover of memory addresses above the HMA (slightly above 1 MB).
-m, --Mt32RomsPath Zip file or directory containing the MT-32 ROM files
-c, --CDrive Path to C drive, default is exe parent
-r, --RecordedDataDirectory Directory to dump data to when not specified otherwise. Working directory if blank
-e, --Exe Required. Path to executable
-a, --ExeArgs List of parameters to give to the emulated program
-x, --ExpectedChecksum Hexadecimal string representing the expected SHA256 checksum of the emulated program
-f, --FailOnUnhandledPort (Default: false) If true, will fail when encountering an unhandled IO port. Useful to check for unimplemented hardware. false by default.
-g, --GdbPort gdb port, if empty gdb server will not be created. If not empty, application will pause until gdb connects
-o, --OverrideSupplierClassName Name of a class that will generate the initial function information. See documentation for more information.
-p, --ProgramEntryPointSegment (Default: 4096) Segment where to load the program. DOS PSP and MCB will be created before it.
-u, --UseCodeOverride (Default: true) if false it will use the names provided by overrideSupplierClassName but not the code
-i, --InstructionsPerSecond if blank will use time based timer.
-t, --TimeMultiplier (Default: 1)
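The value for --ExpectedChecksum can be computed ahead of time. Here is a minimal Python sketch, assuming the checksum is the SHA-256 digest of the executable file's bytes written as a hexadecimal string, as the option description suggests. The helper name sha256_hex is made up for this illustration and is not part of Spice86.

```python
import hashlib

def sha256_hex(path, chunk_size=65536):
    """Return the SHA-256 digest of the file at path as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_hex("file.exe") would give a string that could then be passed with -x. Whether Spice86 compares it case-sensitively is not stated here, so treat that as something to verify.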
Spice86 speaks the GDB remote protocol:
- it supports most of the commands you need to debug.
- it also provides custom GDB commands to do dynamic analysis.
Alternatively, Spice86 has a home-grown debugger.
You need to specify a port for the GDB server to start when launching Spice86:
Spice86 will wait for GDB to connect before starting execution so that you can set up breakpoints and so on.
Here is how to connect from the GDB command-line client and how to set the architecture:
(gdb) target remote localhost:10000
(gdb) set architecture i8086
You can add breakpoints, step, view memory and so on.
Example with a breakpoint on VGA VRAM writes:
Viewing assembly:
Removing a breakpoint:
Searching for a sequence of bytes in memory (start address 0, length F0000, ascii bytes of ‘Spice86’ string):
(gdb) find /b 0x0, 0xF0000, 0x53, 0x70, 0x69, 0x63, 0x65, 0x38, 0x36
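Typing the byte values by hand is error-prone, so the command above can be generated from the string itself. This is a small Python sketch; the helper gdb_find_command is hypothetical and only formats text. Note that GDB reads the second argument as an end address unless it is prefixed with +, which coincides with the length when the search starts at 0.

```python
def gdb_find_command(text, start=0x0, end=0xF0000):
    """Build a GDB 'find /b' command searching for the ASCII bytes of text."""
    byte_list = ", ".join(f"0x{b:02x}" for b in text.encode("ascii"))
    return f"find /b 0x{start:x}, 0x{end:X}, {byte_list}"

print(gdb_find_command("Spice86"))
# find /b 0x0, 0xF0000, 0x53, 0x70, 0x69, 0x63, 0x65, 0x38, 0x36
```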
GDB does not support x86 real mode segmented addressing, so pointers need to refer to the actual physical address in memory. VRAM at address A000:0000 would be 0xA0000 in GDB.
Similarly, the $pc variable in GDB will be exposed by Spice86 as the physical address pointed to by CS:IP.
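The conversion from a segment:offset pair to the physical address GDB expects is segment * 16 + offset. A minimal Python sketch follows; the function name and the wrap_at_1mb parameter are illustrative assumptions that model the generic real mode rollover, not Spice86's internal implementation.

```python
def physical_address(segment, offset, wrap_at_1mb=True):
    """Convert a real mode segment:offset pair to a physical address."""
    address = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if wrap_at_1mb:
        # With only 20 address lines, addresses past 1 MB roll over to 0.
        address &= 0xFFFFF
    return address

print(hex(physical_address(0xA000, 0x0000)))  # 0xa0000
```

With this helper, a breakpoint at CS:IP = 1000:0020 would be set in GDB at 0x10020, and the default --ProgramEntryPointSegment of 4096 (0x1000) corresponds to physical address 0x10000.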
The list of custom commands can be displayed like this:
Dumps everything described below in one shot. Files are created in the dump folder, as explained above.
Several files are produced:
- spice86dumpMemoryDump.bin: Snapshot of the real mode address space. Contains the instructions that are actually loaded and executed. They may differ from the exe you are running because DOS programs can rewrite some of their instructions / load additional modules in memory.
- spice86dumpExecutionFlow.json: Contains information used by the spice86-ghidra-plugin to make sense of the memory dump, like addresses of the functions, the labels, the instructions that have been executed …
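The memory dump can also be inspected offline, outside GDB. The Python sketch below searches a dump for a byte pattern. It assumes the file is a flat image whose offset 0 corresponds to physical address 0, which is an assumption to check against your own dumps; the function names are made up for this example.

```python
def find_all(data, pattern):
    """Return every offset at which pattern occurs in data (overlaps included)."""
    offsets = []
    position = data.find(pattern)
    while position != -1:
        offsets.append(position)
        position = data.find(pattern, position + 1)
    return offsets

def search_dump(path, pattern):
    """Read a dump file and return the offsets where pattern is found."""
    with open(path, "rb") as f:
        return find_all(f.read(), pattern)
```

For example, search_dump("spice86dumpMemoryDump.bin", b"Spice86") would list the offsets where the ASCII string appears in the dump.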
Break after x emulated CPU Cycles:
(gdb) monitor breakCycles 1000
Break at the end of the emulated program:
Refreshing screen or buffers while debugging:
(gdb) monitor vbuffer refresh
For a pleasing and productive experience with GDB, the seerGDB client is highly recommended.
Concrete example with Cryo Dune here.
First run your program and make sure everything works fine in Spice86. If you encounter issues it could be due to unimplemented hardware / DOS / BIOS features.
When Spice86 exits, it should dump data in the current folder or in the folder specified by the SPICE86_DUMPS_FOLDER environment variable.
Open the data in Ghidra with the spice86-ghidra-plugin and generate code. You can import the generated files in a template project you generate via the spice86-dotnet-templates:
dotnet new spice86.project
You can provide your own C# code to override the program's original assembly code.
Spice86 can take as input an instance of Spice86.Core.Emulator.Function.IOverrideSupplier that builds a mapping between the memory address of functions and their C# overrides.
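To illustrate the idea of such a mapping only, here is a conceptual Python sketch of an address-to-override table. It is not the Spice86 API: the class OverrideTable and its methods are invented for this example, and the real overrides are written in C#.

```python
class OverrideTable:
    """Maps physical addresses of functions to replacement callables."""

    def __init__(self):
        self._overrides = {}

    def register(self, segment, offset, handler):
        """Associate the function at segment:offset with a replacement."""
        self._overrides[(segment << 4) + offset] = handler

    def call(self, segment, offset, emulate):
        """Run the override if one is registered, otherwise emulate."""
        handler = self._overrides.get((segment << 4) + offset)
        return handler() if handler is not None else emulate()
```

The fallback to emulation is what makes a gradual rewrite possible: functions without an override keep running as the original assembly.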
For a complete example you can check the source code of Cryogenic.
Here is a simple example of what it would look like:
6 Comments
johnklos
Forty years ago I had a Sinclair QL with an 8086 emulator. Because the Sinclair QL had preemptive multitasking, I could easily search memory for patterns, monitor locations, stop and start the emulation, or change memory programmatically and easily from the QDOS side. It was worlds easier than using a debugger, particularly since I didn't own an 8086 system.
I always thought it was a clever way to get insights into software while it was running that wasn't available to people with 8086 systems, and it's interesting to see this idea so many years later.
DrNosferatu
A tutorial on how to reverse engineer a simple DOS game would be absolutely awesome!
bernadus_edwin
Why are so many emulators written in C#?
gexos
Reverse engineering old games is like digital archaeology—except instead of digging up fossils, you’re unearthing spaghetti code and DRM nightmares. Spice86 seems like an exciting new shovel for the job!
eminence32
Question from a reverse-engineering noob:
Why can't ghidra (or any other reverse engineering tool) be used directly on the .exe? Why do you have to go through this emulator? Is it because the thing you want to debug only runs in x86 realmode?
ggambetta
Oooh, I LOVE this! Especially the ability to "Overriding emulated code with C# code". I had a similar idea years ago (https://gabrielgambetta.com/remakes.html), not in the context of a debugger or reverse engineering per se, but in the context of remakes and "special edition" games. Not entirely surprised that this is a byproduct of OpenRakis. Amazing work!