mishegos

A differential fuzzer for x86 decoders.

Read more about mishegos in its accompanying blog post and academic publication (paper | recording | slides).

@InProceedings{woodruff21differential,
  author       = "William Woodruff and Niki Carroll and Sebastiaan Peters",
  title        = "Differential analysis of x86-64 instruction decoders",
  booktitle    = "Proceedings of the Seventh Language-Theoretic Security Workshop~({LangSec}) at the {IEEE} Symposium on Security and Privacy",
  year         = "2021",
  month        = "May"
}

Usage

Start with a clone, including submodules:

git clone --recurse-submodules https://github.com/trailofbits/mishegos

Building

mishegos is most easily built within Docker:

docker build -t mishegos .

Alternatively, you can try building it directly.

Make sure you have binutils-dev (or however your system provides libopcodes) installed, then run:

make
# or
make debug

Running

Run the fuzzer for a bit:

./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

mishegos checks for four environment variables:

  • V=1 enables verbose output on stderr
  • D=1 enables the "dummy" mutation mode for debugging purposes
  • M=1 enables the "manual" mutation mode (i.e., read from stdin)
  • MODE=mode can be used to configure the mutation mode in the absence of D and M
    • Valid mutation modes are sliding (default), havoc, and structured
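The variables above can be combined on a single command line. The following is a minimal sketch of a wrapper that rejects an unknown mode before launching the fuzzer. The names `valid_mode`, `run_fuzzer`, and `MISHEGOS_BIN` are hypothetical and not part of mishegos; only `V`, `MODE`, and the three mode names come from the list above.

```shell
# Hypothetical wrapper, not part of mishegos: check MODE before launching.
MISHEGOS_BIN="${MISHEGOS_BIN:-./src/mishegos/mishegos}"

# Succeeds only for the three documented mutation modes.
valid_mode() {
  case "$1" in
    sliding|havoc|structured) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: run_fuzzer [mode]   (mode defaults to sliding)
run_fuzzer() {
  mode="${1:-sliding}"
  if ! valid_mode "$mode"; then
    echo "unknown mutation mode: $mode" >&2
    return 2
  fi
  if [ ! -x "$MISHEGOS_BIN" ]; then
    echo "mishegos binary not found at $MISHEGOS_BIN; build it first" >&2
    return 1
  fi
  # V=1 enables verbose output on stderr; MODE selects the mutation mode.
  V=1 MODE="$mode" "$MISHEGOS_BIN" ./workers.spec > /tmp/mishegos
}
```

For example, `run_fuzzer havoc` runs the fuzzer in havoc mode with verbose output. Because `D` and `M` take precedence over `MODE`, the sketch leaves both unset.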

Convert mishegos's raw output into JSONL suitable for analysis:

./src/mish2jsonl/mish2jsonl /tmp/mishegos > /tmp/mishegos.jsonl

mish2jsonl checks for V=1 to enable verbose output on stderr.

Run an analysis/filter pass group on the results:

./src/analysis/analysis -p same-size-different-decodings < /tmp/mishegos.jsonl > /tmp/mishegos.interesting
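One quick sanity check after this step is to compare how many records the pass group kept. The sketch below assumes both files hold one record per line, which is true of JSONL and is suggested for the filtered output by its use with `split --lines` in the tip further down. `count_records` is a hypothetical helper, not part of mishegos.

```shell
# Hypothetical helper: count newline-delimited records, or print 0 if the
# file is missing or unreadable.
count_records() {
  if [ -r "$1" ]; then
    wc -l < "$1" | tr -d ' '
  else
    echo 0
  fi
}

total=$(count_records /tmp/mishegos.jsonl)
kept=$(count_records /tmp/mishegos.interesting)
echo "analysis kept $kept of $total records"
```

A kept count of zero may simply mean the run was too short to surface any discrepancies for that pass group.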

Generate an HTML visualization of the filtered results:

./src/mishmat/mishmat < /tmp/mishegos.interesting > /tmp/mishegos.html
open /tmp/mishegos.html

Tip: The HTML file that mishmat generates can be hundreds of megabytes in size, which most browsers handle poorly. With the split tool, you can instead create multiple smaller HTML files, each holding a fixed number of entries (10,000 in the following example), and load them separately:

mkdir /tmp/mishegos-html
split -d --lines=10000 - /tmp/mishegos-html/mishegos_ \
    --additional-suffix='.html' --filter='./src/mishmat/mishmat > $FILE' \
    < /tmp/mishegos.interesting
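To estimate in advance how many HTML files the split will produce, divide the entry count by the chunk size and round up. The sketch below assumes one entry per line in the input, as the `--lines` option implies; `chunk_count` is a hypothetical helper name.

```shell
# Hypothetical helper: number of files produced when splitting
# <entries> lines into chunks of <per_file> lines (ceiling division).
chunk_count() {
  entries="$1"
  per_file="$2"
  echo $(( (entries + per_file - 1) / per_file ))
}

chunk_count 25000 10000   # prints 3
```

With real data, pass the line count of the filtered results: `chunk_count "$(wc -l < /tmp/mishegos.interesting)" 10000`.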

Contributing

We welcome contributors to mishegos!

A guide for adding new disassembler workers can be found here.

Performance notes

All numbers below correspond to the following run:

V=1 timeout 60s ./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

Outside Docker:

  • On a Linux desktop (Ubuntu 20.04, Ryzen 5 3600, 32GB DDR4):
    • Commit d80063a
    • 8 workers (no udis86) + 1 mishegos fuzzer process
    • 8.7M outputs/minute
    • 9 cores pinned
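For comparison with other runs, the figures above can be restated per second and per process. This is only arithmetic on the numbers listed; the variable names are illustrative.

```shell
# Figures taken from the run described above.
outputs_per_minute=8700000
workers=8
fuzzers=1

echo "$(( outputs_per_minute / 60 )) outputs/second"   # prints "145000 outputs/second"
echo "$(( workers + fuzzers )) processes"              # prints "9 processes"
```

The process count matches the 9 pinned cores, that is, one core per process.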

TODO

  • Performance improvements
    • Break cohort collection out into a separate process (requires re-addition of semaphores)
    • Maybe use a better data structure for input/output/cohort slots
  • Add a scaling factor for workers, e.g. spawn N of each worker
  • Pre-analysis normalization (whitespace, immediate representation, prefixes)
  • Analysis strategies:
    • Filter by length, decode status discrepancies
    • Easy: lexical comparison
    • Easy: reassembly + effects modeling (maybe with microx?)
  • Scoring ideas:
    • Low value: Flag/prefix discrepancies
    • Medium value: Decode success/failure/crash discrepancies
    • High value: Decode discrepancies with differing control flow, operands, maybe some immediates
  • Visualization ideas:
    • Basic but not really basic: some kind of mouse-over differential visualization

License

mishegos is licensed and distributed under the Apache v2.0 license. Contact us if you’re looking for an exception to the terms.