Reduce parser size and compile time #85
Merged
This PR makes a few optimizations to the grammar to reduce the size of the generated parser. It also regenerates the parser with the latest `tree-sitter` CLI, which contains tree-sitter/tree-sitter#3234, an optimization to the compile time of the generated C code that affects this parser significantly.

This should unlock compiling this parser to WASM and packaging it in a Zed extension. Compiling to WASM is still slow (about 45 seconds on my M3 MacBook), while compiling to aarch64 takes only 1.5 seconds. I still don't understand why clang is so much slower when compiling to WASM than when compiling to native.
I believe that the reason this parser compiles so much slower than many others is that its generated `ts_lex` function has a large number of states, which I think is due to some complex tokens in this grammar:

- `unquoted` and `unquoted_in_list`, which both expand to a fairly complex sequence of different anonymous tokens defined by separate regexes.
- the `identifier` rules.

I feel like there is probably a lot of room to simplify some of this, but I didn't attempt it in this PR.
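For anyone who wants to check the state-count claim before and after a grammar change, here is a rough sketch that counts the distinct `case N:` labels inside `ts_lex` in a generated `src/parser.c`. The helper name `count_lex_states` is made up for illustration, and the script assumes the usual shape of the generated code (a top-level function whose closing brace sits at column 0, with one numeric `case` label per lexer state); it is a heuristic, not something the tree-sitter CLI provides.

```python
import re
import sys


def count_lex_states(parser_c: str, func_name: str = "ts_lex") -> int:
    """Count distinct numeric `case N:` labels in the body of `func_name`.

    Assumption: the function is defined at top level and its closing
    brace is the first `}` at the start of a line after the definition.
    """
    # Match the definition `ts_lex(...) {`, not `ts_lex_keywords(...)`.
    start = re.search(
        r"\b" + re.escape(func_name) + r"\s*\([^;{]*\{", parser_c
    )
    if start is None:
        return 0

    # Braces inside the body are indented, so a `}` at column 0 ends it.
    # (Plain brace counting would be fooled by literals such as '{'.)
    end = re.compile(r"^\}", re.MULTILINE).search(parser_c, start.end())
    body = parser_c[start.end(): end.start() if end else len(parser_c)]

    labels = re.findall(r"^\s*case\s+(\d+)\s*:", body, flags=re.MULTILINE)
    return len(set(labels))


if __name__ == "__main__":
    # Usage: python count_lex_states.py src/parser.c
    with open(sys.argv[1], encoding="utf-8") as f:
        print(count_lex_states(f.read()))
```

Running it on the parser generated from the old and new grammar gives a quick, if approximate, measure of how much a token simplification shrinks the lexer.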
/cc @fdncred