Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: dynamic batch size for tx lookup stage #3134

Merged
merged 3 commits into from
Jun 17, 2023
Merged

Conversation

onbjerg
Copy link
Member

@onbjerg onbjerg commented Jun 13, 2023

We had some perf regressions on the tx lookup stage. A quick fix was implemented in #3128, but I think this fixes the underlying issue itself.

Essentially, we split the hashing work into batches. The size of these batches were determined by 100_000 / num_cpus. Assume we are on a 16 core machine and the commit threshold of 50_000 was retained.

This gives us:

  • Batch size: circa 6k
  • Number of batches: circa 8

In other words, we are idling on half of our cores.

I think the reason #3128 alleviated this is because it would give us:

  • Batch size: circa 6k
  • Number of batches: circa 830

A more reasonable fix is to simply split the amount of available work across all threads evenly, i.e. if your commit threshold is 50k, then the batch size would be 50k / num_cpus.

This also brings the batching logic in line with the sender recovery stage.

@onbjerg onbjerg added A-staged-sync Related to staged sync (pipelines and stages) C-perf A change motivated by improving speed, memory usage or disk footprint labels Jun 13, 2023
@codecov
Copy link

codecov bot commented Jun 13, 2023

Codecov Report

Merging #3134 (a0668d0) into main (39c6b22) will decrease coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #3134      +/-   ##
==========================================
- Coverage   70.20%   70.14%   -0.06%     
==========================================
  Files         524      524              
  Lines       69032    69036       +4     
==========================================
- Hits        48462    48427      -35     
- Misses      20570    20609      +39     
Flag Coverage Δ
integration-tests 16.85% <0.00%> (-0.01%) ⬇️
unit-tests 65.15% <100.00%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
crates/stages/src/stages/tx_lookup.rs 77.55% <100.00%> (+0.08%) ⬆️

... and 8 files with indirect coverage changes

@gakonst
Copy link
Member

gakonst commented Jun 14, 2023

@onbjerg
Copy link
Member Author

onbjerg commented Jun 15, 2023

@gakonst above branch seems unpopular, although anecdotally it does slow down bodies, but is still faster than having the lookup stage be separate. We can merge this w/o consideration for the above branch

Copy link
Member

@gakonst gakonst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

@gakonst gakonst added this pull request to the merge queue Jun 17, 2023
Merged via the queue into main with commit e252cd6 Jun 17, 2023
@gakonst gakonst deleted the onbjerg/txlookup-batch-size branch June 17, 2023 01:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-staged-sync Related to staged sync (pipelines and stages) C-perf A change motivated by improving speed, memory usage or disk footprint
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants