Naenie (Many in high valyrian words) tries to generate valid code mutations for JavaScript code using calls between WASM and JS as the mutation operator. This tool is a proof of concept yet, and it is growing right now.
It is possible to obtain an abstract syntax tree (AST) from a JavaScript input file. Babel parser brings an AST structure base on spec. Basically, this tool (Naenie) extends those nodes with extra information.
First, we analyze the static information from the JavaScript code like literal numbers, strings, booleans, etc. After that, we evaluate an instrumented version of the input code with some provided script to explore the runtime of the target. Later, the subtrees that can be translated to a Web Assembly Text (WAT) function are detected. The final step is to generate Web assembly binary code (WASM) and validate it.
The final output is a JavaScript file with its WASM complement.
Also, we implement a "sandbox" wrapper to test the generated code in Browser or NodeJS environments with some workload using the both implementations: the original one and the mutated one.
Basically, we have three inputs for the tool:
- The file to be mutated
- The coverage script to evaluate de input script finding the maximum coverage
- The workload script to evaluate the mutation against the original one.
And two outputs:
- The mutated javascript code who calls WASM code to complement it
- The complement WASM code
The objective of this tool is to find subtrees in the script AST and then substitute them for an equivalent call to WASM code. Due to JavaScript dynamic behavior, we are focusing to arithmetic expression up to date.
The image above shows a tree piece with the translatable subtrees drawn as red circles.
We defined three principal parameters: min-size (min subtree size to replace), max-size (max subtree size to replace) and probability-threshold ( [0, 1] probability to translate the subtree ).
The preliminary analysis of the tools testing shows a overhead invocating WASM functions.
The main reason is to ensure the arithmetic subtree returns numeric values indeed (boolean values are mapped as int32 values, 0 and 1), watching for "large" integers and floats in runtime evaluation to create a correct mapping between them and the WASM numeric types.
If one child in the arithmetic subtree (always binary subtree) cannot be accessed from the WASM context then this subtree is passed to the translated node as a function argument, for example:
var a = 1 + a[0]
Is translated to
var a = ww.Wrapper(a[0])
(function $Wrapper (param i32) (return i32)
i32.const 1
get_local 0
i32.add
)
We need to care about boolean operations shortcut too
Javascript does not support int64 function signature for WASM up to date dcodeIO/webassembly#26
You can use two options to run the mutator:
- Using ts-node:
npm run ts src/main.dev.ts <target> <coverage> <workload>
- Compile the complete code to a ES6 JavaScript library
npm run compile
and then execute the compiled codenode dist/naenie.js
Help output
-v output the version number
-t --target <target> Target script
-c --coverage <coverage> Coverage script
-w --workload <workload> Workload script
-m --minumum <minimum> Minimum tree size to translate
-M --maximum <maximum> Maximum tree size to translate
-u --threshold <threshold> Probability to translate subtree
-s --sandbox <sandbox> Sandbox wrapper: 'node' or 'browser'
-h, --help output usage information
We use the official tool set from WebAssembly to compile and validate our code generation
We provide a minimal tool to mutate JavaScript code to an hybrid between JS and WASM languages.
Run python tests/run_all.py
to execute the complete pipeline to mutate and profile target test scripts.
We open and profile the both, mutated and original codes, with the workload script. This process is done in an automated way using the 'chrome-remote-interface' and the 'chrome-launcher' node packages. The profiling output file protocol is showed below:
{
nodes: {
id: number,
callFrame: {
"functionName": string,
scriptId: string,
url: string,
lineNumber: number,
columnNumber: number
},
hitCount: number,
children: number[],
positionTicks?: {
line: number,
ticks: number
}[]
}[],
startTime: number,
endTime: number,
samples: number[],
timeDeltas: number[]
}
From https://chromedevtools.github.io/devtools-protocol/tot/Profiler/#type-Profile
Mutated call graph
Original call graph
- Test subjects
- quicksort (DONE)
- zip (DONE)
- sudoku (In process)
- md5 (In process)
- rsa (In process)
- rc4 (In process)
- canny (In process)
- lcs (In process)
- laguerre (In process)
- linreg (In process)
- mandelbrot (DONE)
- sha256 (DONE)
- Mutation output
- Garbage collection calls as noise
- Mutated candidate performance and overhead
- Execution time
- Memory
- CPU
- Extend translatable subtrees coverage
- Extend testing to Firefox, Edge, Safari and NodeJS profiling
- Implement an oracle for the mutated code using a provided test suite
mutation(script)(wl) == script(wl)