fix: dynamic batch size for tx lookup stage #3134

onbjerg · 2023-06-13T23:32:09Z

We had some perf regressions on the tx lookup stage. A quick fix was implemented in #3128, but I think this fixes the underlying issue itself.

Essentially, we split the hashing work into batches. The size of these batches was determined by 100_000 / num_cpus. Assume we are on a 16-core machine and the commit threshold of 50_000 was retained.

This gives us:

Batch size: circa 6k
Number of batches: circa 8

In other words, with only 8 batches for 16 cores, we are idling on half of our cores.

I think the reason #3128 alleviated this is that it would give us:

Batch size: circa 6k
Number of batches: circa 830

A more reasonable fix is to simply split the amount of available work across all threads evenly, i.e. if your commit threshold is 50k, then the batch size would be 50k / num_cpus.

This also brings the batching logic in line with the sender recovery stage.
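To make the arithmetic above concrete, here is a minimal sketch comparing the two sizing schemes. It is not the code from this PR: the function names are hypothetical, and the `.max(1)` guards are an assumption about how a zero divisor or zero batch size might be avoided.

```rust
/// Batch size under the old scheme described above: a fixed budget of
/// 100_000 items divided by the number of cores.
/// (Hypothetical helper; the `.max(1)` guards are assumptions.)
fn fixed_batch_size(num_cpus: usize) -> usize {
    (100_000 / num_cpus.max(1)).max(1)
}

/// Batch size under the proposed scheme: the commit threshold split
/// evenly across all cores.
/// (Hypothetical helper; the `.max(1)` guards are assumptions.)
fn dynamic_batch_size(commit_threshold: usize, num_cpus: usize) -> usize {
    (commit_threshold / num_cpus.max(1)).max(1)
}

/// Number of batches needed to cover `total_items`, rounding up.
fn num_batches(total_items: usize, batch_size: usize) -> usize {
    let batch_size = batch_size.max(1);
    (total_items + batch_size - 1) / batch_size
}

fn main() {
    // Values taken from the example in the description.
    let num_cpus = 16;
    let commit_threshold = 50_000;

    let old = fixed_batch_size(num_cpus);
    let new = dynamic_batch_size(commit_threshold, num_cpus);

    println!(
        "fixed:   batch size {}, batches {}",
        old,
        num_batches(commit_threshold, old)
    );
    println!(
        "dynamic: batch size {}, batches {}",
        new,
        num_batches(commit_threshold, new)
    );
}
```

With 16 cores and a commit threshold of 50_000, the fixed scheme yields 8 batches of 6_250 items, while the dynamic scheme yields 16 batches of 3_125 items, one per core.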

codecov · 2023-06-13T23:54:15Z

Codecov Report

Merging #3134 (a0668d0) into main (39c6b22) will decrease coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #3134      +/-   ##
==========================================
- Coverage   70.20%   70.14%   -0.06%     
==========================================
  Files         524      524              
  Lines       69032    69036       +4     
==========================================
- Hits        48462    48427      -35     
- Misses      20570    20609      +39

| Flag | Coverage Δ | |
|---|---|---|
| integration-tests | `16.85% <0.00%> (-0.01%)` | ⬇️ |
| unit-tests | `65.15% <100.00%> (-0.06%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| crates/stages/src/stages/tx_lookup.rs | `77.55% <100.00%> (+0.08%)` | ⬆️ |

... and 8 files with indirect coverage changes

gakonst · 2023-06-14T06:50:13Z

will be superseded by https://github.com/paradigmxyz/reth/tree/onbjerg/rm-txlookup-stage

onbjerg · 2023-06-15T05:22:59Z

@gakonst the above branch seems unpopular. Anecdotally, it does slow down bodies, but it is still faster than having the lookup stage be separate. We can merge this without consideration for the above branch.

gakonst

makes sense

fix: dynamic batch size for tx lookup stage

b947f21

onbjerg added the A-staged-sync (related to staged sync: pipelines and stages) and C-perf (a change motivated by improving speed, memory usage or disk footprint) labels Jun 13, 2023

onbjerg requested review from rkrasiuk and shekhirin as code owners June 13, 2023 23:32

fix: borrow checker

cbe7d44

chore: prevent div by 0

a0668d0

gakonst approved these changes Jun 17, 2023

gakonst added this pull request to the merge queue Jun 17, 2023

Merged via the queue into main with commit e252cd6 Jun 17, 2023

gakonst deleted the onbjerg/txlookup-batch-size branch June 17, 2023 01:37

joshieDo pushed a commit that referenced this pull request Jun 17, 2023

fix: dynamic batch size for tx lookup stage (#3134)

1fee36a
