-
-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement jest-haste-map
instead of node-haste
#896
Conversation
Actually the time it takes to run a fast test (~10ms) on www with this is 2s overall. That includes the 500ms overhead of loading jsdom too, so we are already pretty close to 1s for building the haste map. |
ab2582a
to
db220d7
Compare
path.sep + '**' + path.sep + '*.{' + extensions.join(',') + '}'; | ||
|
||
return Promise.all(roots.map( | ||
root => globPromise(root + pattern) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this nice and simple implementation of the node crawler actually turns out to have 2x worse performance than the one in node-haste. I might revert to that one instead. Didn't think glob would be sooo slow.
8e447d9
to
10f8caf
Compare
Done in e3189c4: "Clean up HasteResolver and file caching for all of this". Now we only create one cache file. It is amended by "HasteResolver" with mocking information and is directly read from a TestWorker. One nice side-effect of all this work is that Jest's own test suite is almost 2x as fast now. It only takes 2.5s instead of 4-5s. |
/* | ||
* original author: dead_horse <[email protected]> | ||
* ported by: yaycmyk <[email protected]> | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could take over the “fastpath” name on npm :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"fasterpath" :)
congrats on the speed improvements, that’s awesome! I don’t have too much to say about the code. The indentation level of promises can also get quite deep :-) |
Should we write a dex with the haste resolution rules? |
we should try to get the “fastpath” module and npm and publish the two file crawlers seperately + an watchman auto-detection function. |
A quick synthetic benchmark seems to suggest that native |
Thanks for the review.
|
01bd200
to
b743e43
Compare
I made a bunch of micro-optimizations in ed2ee78 Since both my metadata objects for all files/modules contain the same fixed amount of items, using constant keys and arrays was a great optimization and the code complexity stayed pretty much the same ( The file size reduction of using constant keys and arrays instead of long string names and objects was 10 mb (!) down from 26 to 16 on a large code base. I was then able to exclude 5 % of the files from the original list and am now down to 14 mb. The parse time went down from 500ms to 250ms. The write time, interestingly, is still at 300ms. I looked into compacting the paths but that would only yield a win of another 2mb or so but it would add additional post-processing. I believe with this model the file size will grow to 30 mb over the next 3-4 years and decode in less than a second which will be more than good enough. Then, someone can consider different serialization models but right now we are in the diminishing-returns territory. The data model is simple enough and the code is easy enough to change. Things I tried:
|
0607cf7
to
ead50f7
Compare
function findNative(roots, extensions, ignore, callback) { | ||
const args = [].concat(roots); | ||
args.push('-type', 'f'); | ||
extensions.forEach((ext, index) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need parentheses around the extensions
find x -type f -name '*.json' -o -name '*.js'
is not equivalent to
find x -type f \( -name '*.json' -o -name '*.js' \)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait, what's the difference? It seems to work fine on www for example. This is what it was like in node-haste1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the first one would also match directories with ending with “.js”
static getCacheFilePath(tmpdir) { | ||
const hash = crypto.createHash('md5'); | ||
Array.from(arguments).slice(1).forEach(arg => hash.update(arg)); | ||
return path.join(tmpdir, hash.digest('hex')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to add a human-readable prefix here. That makes it easier to delete these files manually if necessary for debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea! Let's use the name from the options and make it tmpdir/<name>-hash
.
return path.join(tmpdir, hash.digest('hex')); | ||
} | ||
|
||
build() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I’d prefer this to be a static method that returns a promise that resolves to an instance. That way, there’s no need to call this.build().then(…)
/ hasteMap.build().then(…)
anywhere.
But that’s a cosmetic change, so feel free to ignore this comment :-)
This looks solid in general. To make this the base of node-haste (and have RN packager continue to use node-haste), we need better decoupling of caching, crawling, building the map, and extracting requires. |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This is a new haste map implementation for Jest which is much more scalable than node-haste.
node-haste2 isn't well designed and not scalable for short-lived services like Jest. The startup time of node-haste1 vs. node-haste2 on www is almost the same, both between 6 and 8 seconds which is not acceptable for our engineers. This implementation is attempting to accomplish a much reduced and more scalable startup time. It also has reduced scope – the goal of this is to only build a haste map and provide a way to resolve a one-level deep dependency tree, which is all that Jest really needs from node-haste.
jest-haste-map
can serve as an ideal basis for rewriting node-haste2 (into node-haste3!).With this, the cold start time (building the entire haste map) is now about 14 seconds on www (that's acceptable) but the incremental invocation is only 2 seconds (@kyldvs will love me). I haven't heavily micro-optimized the JavaScript in
packages/jest-haste-map/src/index.js
and there is a bunch of data copying withHasteResolver.js
– I believe I can get close to 1 second with these optimizations once I'm done. One of the goals I have is to allow tacking on data (list of mocks) to the haste map and serialize the haste map and read that one in directly in the workers instead of keeping two caches.On every startup node-haste2 asks watchman (or node) for a list of all files. It builds up an in memory-file system and then processes every single file through an insane amount of promise chaining. It takes about 3 seconds on react-native to do this on a warm start. With this new implementation the cold start is about 1.5 seconds and warm start is way less than a second.
jest-haste-map
only does the minimum processing necessary: it retrieves a list of all files only on the first start when using watchman and subsequent invocations only receive the deltas of changed files since the last time Jest has run. It further processes only files that have been changed or added. The node crawler isn't as optimized but it should be faster than node-haste2 as well.What to review:
packages/jest-haste-map/src/index.js
packages/jest-haste-map/src/worker.js
packages/jest-haste-map/src/crawlers/
jest-haste-map
were lifted from node-haste2 and just received small code style updates (removed babel compilation etc.)src/resolvers/HasteResolver.js
for usage.cc @davidaurelio @kentaromiura @jeffmo @vjeux
cc @amasad – if you could take a quick look, I would greatly appreciate it :)