Many peers disconnect when lots of transactions in a block are processed #345
Our current plan is to solve this by introducing a separate thread for the block processing. The network I/O thread will communicate with it through an …
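The thread-plus-channel plan above could be sketched as follows. This is only an illustration in Python, not the Nimbus code; the block format, `process_block` body, and queue names are all hypothetical.

```python
import queue
import threading

def process_block(block):
    # Placeholder for the expensive EVM work; here it just sums tx values.
    return sum(tx["value"] for tx in block["txs"])

def block_worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    # Dedicated block-processing thread: it drains blocks from the channel,
    # so the network I/O thread never blocks on EVM execution.
    while True:
        block = inbox.get()
        if block is None:  # shutdown sentinel
            break
        outbox.put((block["number"], process_block(block)))

inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=block_worker, args=(inbox, outbox), daemon=True)
worker.start()

# Network I/O side: hand a block off through the channel and keep serving peers.
inbox.put({"number": 1, "txs": [{"value": 2}, {"value": 3}]})
inbox.put(None)
worker.join()
result = outbox.get()
print(result)
```

The key property is that `inbox.put` returns immediately, so the networking thread stays responsive to peer requests while blocks are processed elsewhere.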
This is an embarrassingly parallel scenario, so it's worth using workers in a thread pool.
It's not trivially embarrassingly parallel, because one of the things EVM executions spend most time on is fetching and updating account states, contract code, and storage slots. That's only parallel if the database accesses are parallel, and although the database APIs can be called from multiple threads, there's a lot of synchronisation going on that we don't see. Still, it's possible to cache temporarily written states locally in a block-processing thread, so that only database reads need to run in parallel in this scheme.
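The "cache writes locally, parallelise only the reads" idea above can be sketched like this. It is a minimal Python illustration, not the Nimbus database layer; `DB`, `BlockStateCache`, and the transfer format are hypothetical stand-ins.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

DB = {"alice": 10, "bob": 5}   # shared, read-mostly backing store
db_lock = threading.Lock()     # stands in for the DB's hidden synchronisation

class BlockStateCache:
    """Per-execution write cache: writes stay local to the worker,
    so only reads ever touch the shared database."""
    def __init__(self):
        self.writes = {}

    def get(self, key):
        if key in self.writes:         # read-your-own-writes first
            return self.writes[key]
        with db_lock:                  # only reads hit the shared DB
            return DB[key]

    def set(self, key, value):
        self.writes[key] = value       # buffered until the block commits

def execute(transfer):
    # Each worker gets its own cache, so no write contention between them.
    cache, (src, dst, amount) = BlockStateCache(), transfer
    cache.set(src, cache.get(src) - amount)
    cache.set(dst, cache.get(dst) + amount)
    return cache.writes

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(execute, [("alice", "bob", 3), ("bob", "alice", 1)]))
print(results)
```

Note that both executions read the same pre-block state; merging the buffered writes back (and detecting conflicts between them) is the part this sketch leaves out.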
The plan has changed a little, away from using a threadpool perhaps. The Nimbus EVM is now almost at a point where we can run it in an async manner under Chronos. The same mechanism as used to reduce the stack usage, i.e. a continuation, is used to suspend EVM execution when the host says it is waiting for data.

Because all the suspension points occur outside the EVM (the "host" side of EVMC), all the suspension logic is confined to those points as well, except for the EVM's ability to be told it must pause on certain operations and return to its caller with a continuation. In the stack-reduction patches, the EVM always uses a continuation for nested calls. In the async flavour it only uses a continuation when EVMC host operations reply with a "suspend now" flag. This is a bit different from Chronos, where async functions always return a `Future`.

By itself, this is enough to support concurrent EVMs without using threads, and it might be that EVMs always make enough calls to fetch storage data that we can add extra suspensions even when the data is in memory, just to allow the network processes to keep up. However, if they don't pause often enough, the next step is to make the EVM pre-emptive as well, so it always checks whether enough time has passed for a suspension. A timer may be needed to set this flag, but the current idea is to maintain an artificially capped gas limit which, when triggered, causes a pre-emption time check or simply yields, and is then replenished with more from the real gas limit. Modifying the gas limit in this way provides the check for free in the interpreter loop, and because gas is designed to reflect performance, it may well be a good enough approximation to a pre-emption timer without requiring a timer thread or signal to be used.

The preference for trying an async EVM instead of running it in a threadpool came out of discussions a few weeks ago with @zah and @arnetheduck.
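Both suspension mechanisms described above, suspending when the host is waiting for data and pre-empting via an artificially capped gas limit, can be illustrated with a toy generator-based interpreter. This is a Python sketch under assumed names (`run_evm`, `GAS_SLICE`, the `SLOAD:` opcode encoding), not the Nimbus continuation machinery.

```python
GAS_SLICE = 3  # artificially capped gas budget per scheduling slice (assumption)

def run_evm(ops, host_data):
    """Toy interpreter: suspends (yields) when the host has no data ready,
    or when the capped gas slice is exhausted. The gas-cap check comes for
    free in the interpreter loop; no timer thread or signal is needed."""
    gas_left = GAS_SLICE
    for pc, op in enumerate(ops):
        if gas_left == 0:
            yield ("preempt", pc)      # replenish from the real gas limit
            gas_left = GAS_SLICE
        if op.startswith("SLOAD:"):
            key = op.split(":")[1]
            while key not in host_data:
                yield ("await", key)   # host side says "suspend now"
        gas_left -= 1
    yield ("done", len(ops))

# The scheduler resumes the continuation; on "await" the network side
# supplies the missing storage slot before resuming.
host_data = {}
evm = run_evm(["PUSH", "SLOAD:x", "ADD", "PUSH", "MUL"], host_data)
events = []
for event in evm:
    events.append(event)
    if event[0] == "await":
        host_data[event[1]] = 42
print(events)
```

The generator plays the role of the continuation: each `yield` hands control back to the caller with enough state to resume exactly where execution paused, which is what lets many EVMs interleave on one thread.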
It's not obvious this is better than threads (and moving the database and state cache to threads, etc.), not least because threads can use spare CPU cores. However, there are advantages to either approach, and "small devices", with Nimbus confined to a single core for CPU work, were a motivation for having the async option to try out.

Having said that, there is still a thread pool in the plan. We need a threadpool to run concurrent third-party EVMs (as plugins via EVMC). That means, as a side-effect of supporting third-party EVMs for Berlin HF sync, we will still end up, eventually, with a threadpool option for the Nimbus EVM. But with async falling out nicely from the current architecture, and a threadpool EVM being a bit more complicated than spawning parallel threads due to the database aspect, async is probably the mode that will be working first.
This combines two things: a C stack usage change with EVM nested calls via EVMC, and changes to host call tracing.

Feature-wise, the tracing is improved:
- Storage keys and values make more sense.
- The message/result/context objects are shown with all relevant fields.
- The `call` trace is split into entry/exit, so these can be shown around the called contract's operations, instead of only showing the `call` parameters after the nested call is finished.
- Nested calls are indented, which helps to highlight the flow.
- C stack usage is considerably reduced in nested calls when more functionality is enabled (either the tracing here, or other things to come).

This will seem like a minor patch, but C stack usage was the real motivation, after plenty of time in the debugger. Nobody cares about stack usage with `showTxCalls` (you can just use a big stack when debugging). But these subtle changes around the `call` path were found to be necessary for passing all tests when the EVMC nested call code is completed, and that's a prerequisite for many things: async EVM, dynamic EVM, Beam Sync, and fixing #345.

Signed-off-by: Jamie Lokier <[email protected]>
The problem is as mentioned before; ideally we can offload this work. I did not debug this in depth, but I assume it is because we simply do not respond to certain requests, so the peers flag us as useless.
Created an issue here to remain aware that this happens.