Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(@angular/build): address prerendering in-memory ESM resolution in Node.js 22.2.0 and later #27721

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

alan-agius4
Copy link
Collaborator

@alan-agius4 alan-agius4 commented May 27, 2024

Node.js 22.2.0 introduced a breaking change affecting custom ESM resolution.

For more context, see: Node.js issue #53097

Closes: #27674

@angular-robot angular-robot bot added the area: build & ci Related the build and CI infrastructure of the project label May 27, 2024
@alan-agius4 alan-agius4 force-pushed the prerender-node22 branch 3 times, most recently from ff460e8 to e2070bc Compare May 27, 2024 09:19
@alan-agius4 alan-agius4 added the target: patch This PR is targeted for the next patch release label May 27, 2024
@alan-agius4 alan-agius4 force-pushed the prerender-node22 branch 3 times, most recently from b6b744b to 7a5c5e2 Compare May 27, 2024 10:25
@alan-agius4 alan-agius4 requested a review from clydin May 27, 2024 10:25
@alan-agius4 alan-agius4 marked this pull request as ready for review May 27, 2024 10:25
… Node.js 22.2.0 and later

Node.js 22.2.0 introduced a breaking change affecting custom ESM resolution.

For more context, see: [Node.js issue #53097](nodejs/node#53097)

Closes: #53097
This update ensures that we can test prerendering with the ESM loader breaking change. For more details, see: nodejs/node#53097
@alan-agius4 alan-agius4 added action: review The PR is still awaiting reviews from at least one requested reviewer state: blocked on upstream labels May 27, 2024
@alan-agius4
Copy link
Collaborator Author

alan-agius4 commented May 28, 2024

Blocking as there are discussion about reverting the Node.js breaking change nodejs/node#53182. Revert PR nodejs/node#53183

@GeoffreyBooth
Copy link

@alan-agius4 Do you mind explaining the approach taken in this PR? It seems like you refactored from using worker threads to using child processes?

@alan-agius4
Copy link
Collaborator Author

alan-agius4 commented May 29, 2024

@GeoffreyBooth, the refactoring in this PR involves the main process forking a child process and applying the --import flag. The child process will then be responsible for creating the thread workers.

Before

[main process] -> [thread workers --import]

Now

[main process] -> [forked process --import] -> [thread workers]

The above, will only work for Node.js 22.2 as the ESM loaders in workers are not inherited from the main process on older versions of Node.js, hence for older versions we have another code path.

[main process] -> [forked process --import (noop)] -> [thread workers --import]

@GeoffreyBooth
Copy link

the refactoring in this PR involves the main process forking a child process and applying the --import flag. The child process will then be responsible for creating the thread workers.

Thanks. Do you have a way to measure the performance impact of using child processes instead of workers?

So you understand the context, we’re wondering about the necessity of supporting different hooks for different threads, as doing so adds a large amount of complexity to a feature that’s already quite complex.

@alan-agius4
Copy link
Collaborator Author

@GeoffreyBooth, I can run some tests and provide you with some numbers tomorrow, as I am based in Europe. However, the performance impact will vary depending on the application's size and complexity. In a simple "hello world" application, the performance regression is negligible. But as the application grows, the additional IPC layer introduced by the forked process might become a bottleneck, especially when transferring large amounts of data between the main process and the thread worker through the forked process and vice-versa.

@GeoffreyBooth
Copy link

However, the performance impact will vary depending on the application's size and complexity.

Of course, I was just hoping to get a general idea of the difference for an app of nontrivial size.

Once you're using child processes, is there a reason to use worker threads?

@alan-agius4
Copy link
Collaborator Author

Once you're using child processes, is there a reason to use worker threads?

Yes, we use worker threads to prerender an N number of routes in parallel. Additionally, we need to render the application in isolation to ensure that globalThis and the global scopes of the forked process remain unaltered by the rendering process, which could otherwise be affected and modified by running the application.

Thanks. Do you have a way to measure the performance impact of using child processes instead of workers?

Actually, this highlighted a bug in the above implementation, which I haven't had time to fix yet. The issue is that in big apps we are sending a large amount of data from the forked process to the main process via a single message, causing an Invalid string length error. I suspect that there will be an extra latency introduced by the additional IPC layer in the forked process. Previously, the worker process sent data directly to the main process. However, now the data flows through an intermediate forked process, adding complexity to the communication.

@GeoffreyBooth
Copy link

Yes, we use worker threads to prerender an N number of routes in parallel.

Right, but why not use child processes for this purpose? What advantages do worker threads provide? (Sincere question.)

Along those lines, why spawn X number of child processes which then spawn Y number of worker threads? Why not just fork Y number of child processes directly?

I was assuming that threads would be faster and the only question was how much, but then I did some searching and found this: https://github.com/orgs/nodejs/discussions/44264. I’m curious to see if your test case shows worker threads to be faster, once you resolve the issues on this branch. If this branch is faster, would you want to keep this new implementation even if worker threads became an option again? If not, why not? (All sincere questions 😄)

@alan-agius4
Copy link
Collaborator Author

Once you're using child processes, is there a reason to use worker threads?

Since we transfer large data between the main and "worker" process and vice-versa which can easily several megabytes. This should make them faster than child processes, From my understanding thread workers are lightweight, generally faster, and consume fewer resources than forked processes. Furthermore, libraries like Piscina streamline the management of worker threads, offering a more user-friendly pool management APIs.

Along those lines, why spawn X number of child processes which then spawn Y number of worker threads? Why not just fork Y number of child processes directly?

You are correct, an alternative solution might be more optimal, but my primary goal was to minimize changes introduced in a patch version.

I was assuming that threads would be faster and the only question was how much, but then I did some searching and found this: https://github.com/orgs/nodejs/discussions/44264.

I downloaded the benchmark and they are pretty close.

> node-workers
node:worker_threads benchmark: Processing 3000 files with 6 workers

Average time to process 3000 files:  8.062776568001508
Benchmark elapsed time:  40.31395083197951 



> node-forks
> node node/run-forks.mjs

node:child_process.fork benchmark: Processing 3000 files with 6 forks

Average time to process 3000 files:  8.928233420807123
Benchmark elapsed time:  44.641243622004986 

I’m curious to see if your test case shows worker threads to be faster, once you resolve the issues on this branch. If this branch is faster, would you want to keep this new implementation even if worker threads became an option again? If not, why not? (All sincere questions 😄)

That's a good question, if forked process turns out to be faster, likely we'll go with them. After all, our goal is always to reduce build times.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action: review The PR is still awaiting reviews from at least one requested reviewer area: build & ci Related the build and CI infrastructure of the project state: blocked on upstream target: patch This PR is targeted for the next patch release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Node.js v22.2 causes ng build hangs up
2 participants