Second version of Legion, with progress and TODOs
- Independent repository
- An online doc
- Test `tracejump`
  - Replace `QEMU` with `tracejump`
  - `tracejump` optimisation:
    - Investigate the difference between `tracejump` instrumentation and SIMGR
    - Check into `constraints()` to see how constraints are collected
    - In expansion stage, run `tracer` starting from the node selected in tree policy, instead of from the root.
      - Call `step()` on states:
        - Cannot tell which successor to choose
      - `simgr.explore()`:
        - Cannot use it together with tracer
      - `simgr.run()`:
        - Runs into a dead-end state
        - Uses `step()` internally
      - Fixed the logic to choose successors
  - Run on pre-instrumentation binary
    - Program with loops:
      - Why are constraints missing?
        - Because repeated bytes recorded by `tracejump` are not recorded by SIMGR
      - Match the bytes recorded by `tracejump` with the ones in SIMGR
    - CGC programs
    - LAVA-M programs
    - Four-byte-word sample PUT
- Quick Sampler
  - Keep $\delta$ instead of constraints?
- Compare time: Legion - `tracejump` ?= random - `tracejump`:
  - Legion is much slower on one-byte inputs
  - Test on inputs with more bytes (choke-point)
  - Simpler loop:
    - `simple_while.c`:
      - Check the assembly to make sure loops are not simplified away
    - `for` loops
- study `tracejump`
- fix bugs in `tracejump`
- sample PUT triggers the difference between `tracejump` & SIMGR:
  - If any:
    - caused by repeated bytes that are not recorded by SIMGR
    - load the assembly or the binary in GDB and step through it
    - Fixing the mismatch
- Correct the names in Pie Chart
- Correct the counters in the algorithm
- Test on inputs with more bytes
- Test on inputs with `for` loops

Optimisation: avoid executing the binary on inputs that showed up before

- Fixing the mismatch between instrumentation and tracer
- Mark a node as exhausted if quick sampler cannot find any new in_str from it
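
The two ideas above (skip inputs that were already executed, and mark a node as exhausted when the sampler yields nothing new) can be sketched roughly as follows. This is only an illustration under assumptions: `Node`, `run_new_inputs`, `sample`, and `execute` are hypothetical names, not Legion's actual API, and the sampler is passed in as a plain callable.

```python
from typing import Callable, Iterable, Set


class Node:
    """Hypothetical MCTS tree node; only the fields this sketch needs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.exhausted = False


def run_new_inputs(
    node: Node,
    sample: Callable[[Node], Iterable[bytes]],
    execute: Callable[[bytes], None],
    seen: Set[bytes],
) -> int:
    """Execute only unseen inputs sampled from `node`; return how many ran.

    If the sampler produces no new in_str, the node is marked as exhausted.
    """
    fresh = 0
    for in_str in sample(node):
        if in_str in seen:
            continue  # already executed earlier, so skip the binary run
        seen.add(in_str)
        execute(in_str)
        fresh += 1
    if fresh == 0:
        node.exhausted = True
    return fresh
```

Keeping `seen` outside the node means the same input is never executed twice, even when two different nodes sample it.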
- An automatic program to compare the performance of Legion against a given benchmark
- Fix back-propagation: assign rewards according to the in_str generated
- Version-control Angr
- Cannot keep symbolic execution states with preconstraints in the MCTS tree node; otherwise, future symbolic execution will be limited to that input.
- Five kinds of nodes:
  - White: In TraceJump + Not sure if in Angr + check Symbolic state later + may have Simulation child
  - Red: In TraceJump + Confirmed in Angr + has Symbolic state + has Simulation child
  - Black: In TraceJump + Confirmed not in Angr + No Symbolic state + No Simulation child
  - Gold: Not in TraceJump + Not in Angr + Same Symbolic state as parent + is a Simulation child
  - Purple: Unknown TJ path + SymEx found in Angr + has Symbolic state + is a Phantom Node
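
One way to read the colour list above is as a lookup on two facts: whether the node is on a TraceJump path and whether Angr has confirmed it. The sketch below encodes only those two attributes. The names `Colour` and `classify` are hypothetical, and `None` is an assumed encoding for "unknown" or "not checked yet".

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Colour(Enum):
    WHITE = "white"
    RED = "red"
    BLACK = "black"
    GOLD = "gold"
    PURPLE = "purple"


# Key: (in_tracejump, in_angr); None means "unknown / not checked yet".
_COLOURS: Dict[Tuple[Optional[bool], Optional[bool]], Colour] = {
    (True, None): Colour.WHITE,    # in TraceJump, not sure if in Angr
    (True, True): Colour.RED,      # in TraceJump, confirmed in Angr
    (True, False): Colour.BLACK,   # in TraceJump, confirmed not in Angr
    (False, False): Colour.GOLD,   # simulation child, in neither
    (None, True): Colour.PURPLE,   # phantom node found by SymEx only
}


def classify(in_tracejump: Optional[bool], in_angr: Optional[bool]) -> Colour:
    """Map the two membership facts to a node colour, or raise ValueError."""
    try:
        return _COLOURS[(in_tracejump, in_angr)]
    except KeyError:
        raise ValueError(
            f"no colour for in_tracejump={in_tracejump}, in_angr={in_angr}"
        ) from None
```

Under this reading, a White node becomes Red or Black once its Angr status is checked, for example `classify(True, None)` before the check and `classify(True, True)` after.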
- Installation order: `Angr` -> `Cle` -> `Claripy`
- Angr: Fixed the loggers of Angr so that they do not affect importers
- Claripy:
  - Added a new approximate constraint solver backend: Quick Sampler
  - An assertion on the length of `exprs`
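
For the Angr logger item above, the usual way to keep a library's logging from affecting importers is to log through a named logger with a `NullHandler` and leave the root logger alone. The sketch below uses only the standard `logging` module. It is not the actual change made to Angr, and the logger name `mylib` is a placeholder.

```python
import logging

# Hypothetical library logger name; a real package would use __name__.
LIB_LOGGER_NAME = "mylib"


def setup_library_logging() -> logging.Logger:
    """Give the library its own logger without touching the root logger."""
    logger = logging.getLogger(LIB_LOGGER_NAME)
    # NullHandler discards records unless the importing application
    # installs its own handlers, so importing the library stays silent.
    if not any(isinstance(h, logging.NullHandler) for h in logger.handlers):
        logger.addHandler(logging.NullHandler())
    return logger
```

The importing program stays in control: it can call `logging.basicConfig(...)` or attach handlers to `mylib` itself if it wants to see the library's messages.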