Tool to aid disassembling DOS applications created with the Watcom Toolchain.
I'm striving to become a full-time developer of Free and open-source software (FOSS). Donations help me achieve that goal and are highly appreciated.
- The Watcom Toolchain
- Yet Another Disassembly Tool?
- Current State / Future Development
- Output Sample
- Getting Started
- Wcdatool Usage Information
- Contact Information
Many DOS applications of the 90s, especially games, were developed using the Watcom Toolchain. Notable examples are DOOM, Warcraft, Syndicate and Mortal Kombat, just to name a few.
Most end users have probably never heard of Watcom, but might remember applications displaying a startup banner reading something like this: DOS/4G(W) Protected Mode Run-time [...]. DOS/4G(W) was a popular DOS extender bundled with the Watcom Toolchain, allowing DOS applications to run in 32-bit protected mode and thus reach well beyond the limits of 16-bit (MS-)DOS.
Nowadays, the Watcom Toolchain is open source and lives on as Open Watcom / Open Watcom v2 Fork.
The idea for this tool emerged when I discovered that one of my all-time favorite games, Mortal Kombat (CD version), was mainly written in Assembler (almost a line-by-line port of the arcade version) and was released unstripped (i.e. executable contains debug symbols). I tried using various decompilation/disassembly tools on it, only to realize that none seemed to be capable of dealing with the specifics of Watcom-based applications.
Hence, I began writing my own tool. What initially started out as mkdecomptool specifically for Mortal Kombat gradually became the now general-purpose Watcom Disassembly Tool (wcdatool).
Note that while wcdatool performs the tasks it is designed for quite well, it is not intended to compete with or replace high-end tools like IDA Pro or Ghidra.
Wcdatool works quite well in its current state - you'll get a well-readable, reasonably structured disassembly output (objdump format, Intel syntax). Check out issues #9 and #11 for games other than Mortal Kombat that wcdatool worked nicely for thus far. Please note that wcdatool works best when used on executables that contain debug symbols. If you come across other unstripped Watcom-based DOS applications that may be used for further testing and development, please let me know.
The next major goal is to cleanly rewrite the disassembler module and transition from static code disassembly to execution flow tracing. Also, instead of treating an executable's objects separately, a linear unified address space containing all object data will be the basis for future processing. This will allow fixups to be applied at the binary level, which should simplify dealing with references that cross object boundaries, such as placeholders/stubs (which are patched via fixups at run time). Mortal Kombat 2's executable will be the baseline for the new approach, as it contains code regions within its data object (which are currently neither discovered nor processed) and extensively uses placeholders/stubs for jump/call targets that cross object boundaries (which are currently not handled properly).
Last but not least, wcdatool in its current state is relatively slow, as performance has not been the main focus during development. Cython might be utilized in the future to increase performance.
Output sample for Fatal Racing (FATAL.EXE) - the left side shows the reconstructed source files, the right side shows a portion of formatted disassembly:
There are multiple ways to use wcdatool, but the following instructions should get you started. Don't let the amount of information provided below discourage you; the tool is easier to use than it might seem. The instructions assume that you are using Linux. For Windows users, the easiest option is to use Windows Subsystem for Linux (WSL):
1. Check the following requirements:
   Wcdatool: Python (>=3.6.0), wdump (part of Open Watcom v2), objdump (part of binutils)
   (both wdump and objdump need to be accessible via PATH)
   Open Watcom v2: gcc -or- clang (for 64-bit builds), DOSEMU -or- DOSBox (for wgml utility)
   (only relevant if Open Watcom v2 is built from sources; the project also provides pre-compiled binaries)
2. Clone wcdatool's repository (-or- download and extract a release):
   # git clone https://github.com/fonic/wcdatool.git
3. Download, build and install Open Watcom v2 (-or- download and install pre-compiled binaries):
   # cd wcdatool/OpenWatcom
   # ./1_download.sh
   # ./2_build.sh
   # ./3_install_linux.sh /opt/openwatcom /opt/bin/openwatcom
   NOTE: these scripts are provided for convenience, they are not part of the Open Watcom v2 project itself
4. Copy the executables to be disassembled to wcdatool/Executables, e.g. for Mortal Kombat:
   # cp <source-dir>/MK1.EXE wcdatool/Executables
   # cp <source-dir>/MK2.EXE wcdatool/Executables
   # cp <source-dir>/MK3.EXE wcdatool/Executables
   NOTE: file names of executables are used to locate corresponding object hint files (see step 5)
5. Create/update object hint files in wcdatool/Hints (optional; skip when just getting started):
   Object hints may be used to manually affect the disassembly process (e.g. force decoding of certain regions as code/data, specify data decoding mode, define data structs, add comments). Please refer to included object hint files for Mortal Kombat, Fatal Racing and Pac-Man VR for details regarding capabilities and syntax.
   NOTE: hint files must be stored as wcdatool/Hints/<name-of-executable>.txt (case-sensitive, e.g. wcdatool/Executables/MK1.EXE -> wcdatool/Hints/MK1.EXE.txt) to be picked up automatically by the included scripts
6. Let wcdatool process all provided executables (for the example executables listed in step 4, this will take ~3min. and generate ~1.5GB worth of data):
   # wcdatool/Scripts/process-all-executables.sh
   -or- Let wcdatool process a single executable:
   # wcdatool/Scripts/process-single.executable.sh <name-of-executable>
   -or- Run wcdatool manually (use --help to display detailed usage information or see below):
   # python wcdatool/Wcdatool/wcdatool.py -od wcdatool/Output -wao wcdatool/Hints/<name-of-executable>.txt wcdatool/Executables/<name-of-executable>
   NOTE: it is completely normal and expected for wcdatool to produce LOTS of warnings; ignore those when just getting started (see step 8 for details)
7. Have a look at the results in wcdatool/Output:
   - File <name-of-executable>_zzz_log.txt contains log messages (same as console output, but without coloring/formatting)
   - Files <name-of-executable>_disasm_object_x_disassembly_plain.asm contain plain disassembly (unmodified objdump output, useful for reference)
   - Files <name-of-executable>_disasm_object_x_disassembly_formatted.asm contain formatted disassembly (this is arguably the most interesting/useful output)
   - Files <name-of-executable>_disasm_object_x_disassembly_formatted_deduplicated.asm contain formatted deduplicated disassembly (same as above, but with data portions being compressed for increased readability where applicable)
   - Folder <name-of-executable>_modules contains formatted disassembly split into separate files (same as above, attempts to reconstruct an application's original source file structure if corresponding debug information is available)
   NOTE: if you are new to assembler/assembly language, check out this x86 Assembly Guide
8. Refine the output by analyzing the disassembly, updating the object hints and re-running wcdatool (i.e. loop steps 5-8):
   - Identify and add hints for regions in code objects that are actually data (look for "; misplaced item" comments, "(bad)" assembly instructions and labels with trailing "; access size" comments)
   - Identify and add hints for regions in data objects that are actually code (look for call/jmp instructions in code objects with fixup targets pointing to data objects)
   - Check section "Possible object hints" of wcdatool's console output / log file for suggestions (not guaranteed to be correct, but likely a good starting point)
   - The ultimate goal is to eliminate all (or at least most) warnings issued by wcdatool. Each warning points out a region of the disassembly that currently seems flawed and therefore requires further attention/investigation. Note that there is a cascading effect at work (e.g. a region of data that is falsely interpreted as code may produce bogus branches, leading to further warnings), thus warnings should be tackled one (or a few) at a time from first to last, with wcdatool re-runs in between
   NOTE: this is by far the most time-consuming part, but crucial to achieve good and clean results (!)
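As a rough illustration of how the hint file naming convention from step 5 feeds into the manual invocation from step 6, the following Python sketch assembles that command line for a single executable. The helper names hint_file_for and build_command are hypothetical and not part of wcdatool; only the -od and -wao options and the directory layout are taken from the instructions above. Pure paths are used, so nothing is read from or written to disk.

```python
from pathlib import PurePosixPath


def hint_file_for(executable, hints_dir):
    """Return the hint file path expected for an executable.

    Follows the convention Hints/<name-of-executable>.txt; the executable's
    file name is kept as-is because the lookup is case-sensitive.
    """
    return hints_dir / (executable.name + ".txt")


def build_command(repo, exe_name):
    """Assemble the manual wcdatool invocation as an argument list.

    'repo' is the (hypothetical) location of the cloned wcdatool repository.
    """
    executable = repo / "Executables" / exe_name
    hints = hint_file_for(executable, repo / "Hints")
    return [
        "python", str(repo / "Wcdatool" / "wcdatool.py"),
        "-od", str(repo / "Output"),   # output directory
        "-wao", str(hints),            # additional wdump output (object hints)
        str(executable),               # positional FILE argument
    ]


if __name__ == "__main__":
    print(" ".join(build_command(PurePosixPath("wcdatool"), "MK1.EXE")))
```

The resulting list could be handed to subprocess.run; whether to skip the -wao option when no hint file exists yet is left open here.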
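When looping over the refinement steps, it can help to quickly locate the formatted disassembly files of one executable in the output directory. The sketch below does this with a glob pattern built from the file naming scheme listed in step 7. It assumes that the "x" in the file names stands for an object identifier; the function name formatted_disassemblies is hypothetical and not part of wcdatool.

```python
import glob
from pathlib import Path


def formatted_disassemblies(output_dir, exe_name):
    """Return the formatted per-object disassembly files for one executable.

    The pattern ends in '_disassembly_formatted.asm', so the plain and the
    deduplicated variants are not matched.
    """
    # Escape glob metacharacters that might occur in the executable's name
    prefix = glob.escape(exe_name)
    pattern = prefix + "_disasm_object_*_disassembly_formatted.asm"
    return sorted(Path(output_dir).glob(pattern))


if __name__ == "__main__":
    for path in formatted_disassemblies("wcdatool/Output", "MK1.EXE"):
        print(path)
```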
Usage: wcdatool.py [-wde|--wdump-exec PATH] [-ode|--objdump-exec PATH]
                   [-wdo|--wdump-output PATH] [-wao|--wdump-addout PATH]
                   [-od|--output-dir PATH] [-cm|--color-mode VALUE]
                   [-id|--interactive-debugger] [-is|--interactive-shell]
                   [-h|--help] FILE

Tool to aid disassembling DOS applications created with the Watcom Toolchain.

Positionals:
  FILE                            Path to input executable to disassemble
                                  (.exe file)

Options:
  -wde PATH, --wdump-exec PATH    Path to wdump executable (default: 'wdump')
  -ode PATH, --objdump-exec PATH  Path to objdump executable (default:
                                  'objdump')
  -wdo PATH, --wdump-output PATH  Path to file containing pre-generated wdump
                                  output to read/parse instead of running
                                  wdump
  -wao PATH, --wdump-addout PATH  Path to file containing additional wdump
                                  output to read/parse (mainly used for object
                                  hints)
  -od PATH, --output-dir PATH     Path to output directory for storing
                                  generated content (default: '.')
  -cm VALUE, --color-mode VALUE   Enable color mode (choices: 'auto', 'true',
                                  'false') (default: 'auto')
  -id, --interactive-debugger     Drop to interactive debugger before exiting
                                  to allow inspecting internal data structures
  -is, --interactive-shell        Drop to interactive shell before exiting to
                                  allow inspecting internal data structures
  -h, --help                      Display usage information (this message)

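Since wcdatool relies on wdump and objdump, a wrapper script might want to check for both tools before starting a long run. The following Python sketch looks them up on PATH via shutil.which and turns the results into explicit -wde / -ode options. The function name tool_options and its injectable lookup parameter are hypothetical additions for illustration; wcdatool itself simply defaults to 'wdump' and 'objdump'.

```python
import shutil


def tool_options(which=shutil.which):
    """Return (options, missing) for the helper tools wcdatool relies on.

    'options' holds -wde/-ode arguments for every tool that was found,
    'missing' lists the names of tools that could not be located.
    """
    options, missing = [], []
    for flag, tool in (("-wde", "wdump"), ("-ode", "objdump")):
        path = which(tool)  # None if the tool is not accessible via PATH
        if path is None:
            missing.append(tool)
        else:
            options += [flag, path]
    return options, missing


if __name__ == "__main__":
    opts, missing = tool_options()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("Extra options for wcdatool.py: " + " ".join(opts))
```

Passing the lookup function as a parameter keeps the check easy to exercise without the tools being installed.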
If you want to get in touch with me, give feedback, ask questions or simply need someone to talk to, please open an issue here on GitHub. Make sure to provide an email address if you prefer personal/private contact.
Last updated: 10/24/24