
raftstore: optimize the batching mechanism when enabling async-ios. #17029

Open · wants to merge 22 commits into base: master
Conversation

@LykxSassinator (Contributor) commented May 17, 2024

What is changed and how it works?

Issue Number: Ref #16907

What's Changed:

Previously, #16906 had already lowered the compression threshold to alleviate the strain on IO resources. However, it was found that this configuration adjustment had limited impact on reducing IO resource costs.

Therefore, this PR introduces a simple algorithm (using a sliding window) to track the real-time size of the WriteBatch and initiate a yield operation if the WriteBatch size is under the desired threshold. This mechanism aims to grow the WriteBatch size as needed, a strategy that has proven effective in saving IO resources.

Introduce a simple algorithm that enlarges the WriteBatch produced by the batching mechanism as expected, reducing the cost of IO resources.
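The following is a minimal, self-contained sketch of this idea, not the actual TiKV implementation: the `SampleWindow` and `recommend_yield` names, the trend formula, and all constants are hypothetical illustrations of a sliding window that tracks recent WriteBatch sizes and suggests a yield duration when the current batch is below the threshold.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Fixed-capacity sliding window over recent WriteBatch sizes
/// (hypothetical simplification of the mechanism described above).
struct SampleWindow {
    samples: VecDeque<usize>,
    capacity: usize,
}

impl SampleWindow {
    fn new(capacity: usize) -> Self {
        Self { samples: VecDeque::with_capacity(capacity), capacity }
    }

    fn observe(&mut self, batch_size: usize) {
        if self.samples.len() == self.capacity {
            self.samples.pop_front();
        }
        self.samples.push_back(batch_size);
    }

    fn avg(&self) -> f64 {
        if self.samples.is_empty() {
            return 0.0;
        }
        self.samples.iter().sum::<usize>() as f64 / self.samples.len() as f64
    }
}

/// Decide whether the write worker should yield to wait for more writes.
/// Returns `Some(duration)` if the current batch is still under the desired
/// threshold, `None` if it is already large enough to flush now.
fn recommend_yield(
    window: &SampleWindow,
    current_batch_size: usize,
    threshold: usize,
    base_yield: Duration,
) -> Option<Duration> {
    if current_batch_size >= threshold {
        return None;
    }
    // Illustrative trend: recent average relative to the current batch,
    // clamped to [0.5, 2.0]; the exact formula in the PR may differ.
    let avg = window.avg().max(1.0);
    let trend = (avg / current_batch_size.max(1) as f64).clamp(0.5, 2.0);
    Some(base_yield.mul_f64(trend))
}

fn main() {
    let mut window = SampleWindow::new(8);
    for size in [4096_usize, 8192, 16384] {
        window.observe(size);
    }
    // A small batch below the threshold gets a recommended yield duration.
    let hint = recommend_yield(&window, 2048, 32 * 1024, Duration::from_nanos(50));
    println!("yield hint: {:?}", hint);
}
```

In the actual change, the trend is derived from how the observed batch size changes over time and is clamped to [0.5, 2.0], as discussed in the review thread below.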

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

By testing on oltp-read-write workload:

  • It greatly reduces the IOPS cost, by up to 30%. The corresponding bandwidth cost drops from 260 MB/s to ~170 MB/s, which significantly reduces IO resource costs in Cloud environments.
  • The performance degradation introduced by the new batching mechanism is low, < 1% as expected.


Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

Introduce a simple algorithm that enlarges the WriteBatch produced by the batching mechanism as expected, saving IO resource costs.

ti-chi-bot bot (Contributor) commented May 17, 2024

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot bot (Contributor) commented May 17, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

ti-chi-bot bot (Contributor) commented Jun 11, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from lykxsassinator, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added size/M and removed size/S labels Jun 11, 2024
@ti-chi-bot ti-chi-bot bot added size/L and removed size/M labels Jun 12, 2024
@LykxSassinator LykxSassinator marked this pull request as ready for review June 19, 2024 02:50
@LykxSassinator (Contributor, Author)

/test

ti-chi-bot bot (Contributor) commented Jun 19, 2024

@LykxSassinator: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test pull-unit-test

Use /test all to run all jobs.

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@LykxSassinator (Contributor, Author)

/test pull-unit-test

Signed-off-by: lucasliang <[email protected]>
@glorv (Contributor) commented Jun 20, 2024

An intuitive idea: what about moving the write task batching to the raft fsm side? Instead of sending a write message after handling each region, we could first cache each region's write message in the PollCtx, and after handling several regions, once the total write message size exceeds a certain threshold or the current poll round is over, send the cached write tasks to the write thread. It seems easier to handle this batching logic on the raft fsm side than on the write IO thread side.
/cc @Connor1996 @overvenus @zhangjinpeng87 What do you think?
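For illustration only, a rough sketch of what that proposal could look like; `WriteTask`, `PollCtx`, and the flush threshold below are hypothetical simplifications, not TiKV's actual types or API.

```rust
/// Hypothetical write task produced after handling one region.
struct WriteTask {
    region_id: u64,
    data: Vec<u8>,
}

/// Hypothetical per-poll context that caches write tasks instead of
/// sending one write message per region.
struct PollCtx {
    pending_writes: Vec<WriteTask>,
    pending_bytes: usize,
    flush_threshold: usize,
}

impl PollCtx {
    fn cache_write(&mut self, task: WriteTask) {
        self.pending_bytes += task.data.len();
        self.pending_writes.push(task);
        // Flush early if the cached writes are already large enough.
        if self.pending_bytes >= self.flush_threshold {
            self.flush_to_write_thread();
        }
    }

    /// Called when the threshold is exceeded or the poll round ends.
    fn flush_to_write_thread(&mut self) {
        if self.pending_writes.is_empty() {
            return;
        }
        // In a real implementation this batch would be sent to the
        // async-io write thread; here we just drain the cache.
        let batch = std::mem::take(&mut self.pending_writes);
        self.pending_bytes = 0;
        let regions: Vec<u64> = batch.iter().map(|t| t.region_id).collect();
        println!("flushing {} cached write tasks for regions {:?}", batch.len(), regions);
    }
}

fn main() {
    let mut ctx = PollCtx {
        pending_writes: Vec::new(),
        pending_bytes: 0,
        flush_threshold: 64 * 1024,
    };
    for region_id in 0..3 {
        ctx.cache_write(WriteTask { region_id, data: vec![0u8; 4096] });
    }
    // End of the poll round: flush whatever is still cached.
    ctx.flush_to_write_thread();
}
```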

/// If the batch size is smaller than the threshold, it will return a
/// recommended yield duration for the caller as a hint to wait for more writes.
/// The yield interval is calculated based on the trend of the change of the
/// batch size. The range of the trend is [0.5, 2.0]. If the batch size is
Member

Can you explain what 2.0 means and what 0.5 means? What is the quantitative relationship with the wait time?

Contributor (Author)

Just a heuristic ratio based on testing records: the yield time ranges over [yield_time * 0.5, yield_time * 2.0] nanoseconds. By default, yield_time is 50 nanoseconds, so the overall yield range is [25 nanoseconds, 100 nanoseconds].
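To make that relationship concrete, a trivial sketch (assuming the trend is simply clamped to [0.5, 2.0] and multiplies the base yield time):

```rust
use std::time::Duration;

fn main() {
    let yield_time = Duration::from_nanos(50); // default base yield time
    for trend in [0.5_f64, 1.0, 2.0] {
        // The recommended yield is the base time scaled by the clamped
        // trend, i.e. between 25 ns and 100 ns with the defaults above.
        println!("trend {}: {:?}", trend, yield_time.mul_f64(trend));
    }
}
```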

@Connor1996 (Member)

An intuitive idea: what about moving the write task batching to the raft fsm side? Instead of sending a write message after handling each region, we could first cache each region's write message in the PollCtx, and after handling several regions, once the total write message size exceeds a certain threshold or the current poll round is over, send the cached write tasks to the write thread. It seems easier to handle this batching logic on the raft fsm side than on the write IO thread side. /cc @Connor1996 @overvenus @zhangjinpeng87 What do you think?

Looks promising; that's just how the sync path batches the writes within one poll. Maybe you can make a quick PoC.

Signed-off-by: lucasliang <[email protected]>
Signed-off-by: lucasliang <[email protected]>
@zhangjinpeng87 (Member)

An intuitive idea: what about moving the write task batching to the raft fsm side? Instead of sending a write message after handling each region, we could first cache each region's write message in the PollCtx, and after handling several regions, once the total write message size exceeds a certain threshold or the current poll round is over, send the cached write tasks to the write thread. It seems easier to handle this batching logic on the raft fsm side than on the write IO thread side. /cc @Connor1996 @overvenus @zhangjinpeng87 What do you think?

@LykxSassinator @glorv This is a good idea. I think we can merge the current solution first, because it is well tested and has promising benefits in reducing IOPS and IO bandwidth, and then evaluate the new idea. What do you think?

@glorv (Contributor) left a comment

rest LGTM

components/raftstore/src/store/async_io/write.rs (outdated, resolved)
components/raftstore/src/store/async_io/write.rs (outdated, resolved)
components/tikv_util/src/time.rs (resolved)
Signed-off-by: lucasliang <[email protected]>
@LykxSassinator (Contributor, Author) commented Jun 24, 2024

Based on the most recent performance comparisons with version 8.1.0 (the default LTS version), most typical workloads have passed the testing standards. However, it appears that the larger batch flushing introduced by this PR may lead to performance regressions in the oltp_update_non_index workload (IOPS decreases by > 50% as expected, but with ~8% performance regression).

Therefore, this PR will be put on hold until a more suitable configuration for optimizing the WriteBatch size is identified.

Signed-off-by: lucasliang <[email protected]>
Signed-off-by: lucasliang <[email protected]>
Signed-off-by: lucasliang <[email protected]>