Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow more intra-L0 compaction when L0 is small #12214

Closed
wants to merge 8 commits into from

Conversation

cbi42
Copy link
Member

@cbi42 cbi42 commented Jan 9, 2024

Summary: introduce a new option intra_l0_compaction_size to allow more intra-L0 compaction when total L0 size is under a threshold. This option applies only to leveled compaction. It is enabled by default and set to max_bytes_for_level_base / max_bytes_for_level_multiplier only for atomic_flush users. When atomic_flush=true, it is more likely that some CF's total L0 size is small when it's eligible for compaction. This option aims to reduce write amplification in this case.

Test plan:

  • new unit test
  • benchmark:
TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=fillrandom --write_buffer_size=51200 --max_bytes_for_level_base=5242880 --level0_file_num_compaction_trigger=4 --statistics=1

main:
fillrandom   :     234.499 micros/op 4264 ops/sec 234.499 seconds 1000000 operations;    0.5 MB/s
rocksdb.compact.read.bytes COUNT : 1490756235
rocksdb.compact.write.bytes COUNT : 1469056734
rocksdb.flush.write.bytes COUNT : 71099011

branch:
fillrandom   :     128.494 micros/op 7782 ops/sec 128.494 seconds 1000000 operations;    0.9 MB/s
rocksdb.compact.read.bytes COUNT : 807474156
rocksdb.compact.write.bytes COUNT : 781977610
rocksdb.flush.write.bytes COUNT : 71098785

@hx235
Copy link
Contributor

hx235 commented Jan 9, 2024

I happened to notice #10865 - are these related?

@cbi42
Copy link
Member Author

cbi42 commented Jan 9, 2024

I happened to notice #10865 - are these related?

Yes, this is following #10865 (comment) and will solve this same problem.

@facebook-github-bot
Copy link
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 cbi42 requested a review from ajkr January 9, 2024 23:59
Comment on lines 1245 to 1261
// For leveled compaction, when total L0 size is no more than this threshold,
// RocksDB will attempt intra-L0 compaction first before trying to compact
// L0 files down. This is used to reduce write amplification when L0 size
// can be very small.
//
// Note that this option does not affect the intra-L0 compaction that happens
// when there are too many L0 files.
//
// UINT64_MAX - 1 (0xfffffffffffffffe) is a special flag to allow RocksDB
// to pick default.
// 0 means disabled.
//
// Default:
// when atomic_flush = 0: 0
// when atomic_flush=1:
// max_bytes_for_level_base / max_bytes_for_level_multiplier
uint64_t intra_l0_compaction_size = 0xfffffffffffffffe;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we need a new option. Maybe.

What if the threshold is determined based on the actual size(Lbase)? For example, size(L0) < size(Lbase) / max_bytes_for_level_multiplier. Maybe scale down the limit even more. The intuition is we don't want to run L0->Lbase compactions that are very inefficient for write-amp. Does that approach let unit tests pass?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Updated to remove the option and use size(Lbase) / max(10, max_bytes_for_level_multiplier) as the threshold. It turns out that no unit test change is needed.

@facebook-github-bot
Copy link
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@cbi42
Copy link
Member Author

cbi42 commented Jan 19, 2024

I also added some db_bench result to the PR summary.

@facebook-github-bot
Copy link
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 cbi42 requested a review from ajkr January 19, 2024 18:02
Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@facebook-github-bot
Copy link
Contributor

@cbi42 merged this pull request in 4b684e9.

mayuehappy pushed a commit to mayuehappy/frocksdb that referenced this pull request Jul 9, 2024
Summary:
introduce a new option `intra_l0_compaction_size` to allow more intra-L0 compaction when total L0 size is under a threshold. This option applies only to leveled compaction. It is enabled by default and set to `max_bytes_for_level_base / max_bytes_for_level_multiplier` only for atomic_flush users. When atomic_flush=true, it is more likely that some CF's total L0 size is small when it's eligible for compaction. This option aims to reduce write amplification in this case.

Pull Request resolved: facebook/rocksdb#12214

Test Plan:
- new unit test
- benchmark:
```
TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=fillrandom --write_buffer_size=51200 --max_bytes_for_level_base=5242880 --level0_file_num_compaction_trigger=4 --statistics=1

main:
fillrandom   :     234.499 micros/op 4264 ops/sec 234.499 seconds 1000000 operations;    0.5 MB/s
rocksdb.compact.read.bytes COUNT : 1490756235
rocksdb.compact.write.bytes COUNT : 1469056734
rocksdb.flush.write.bytes COUNT : 71099011

branch:
fillrandom   :     128.494 micros/op 7782 ops/sec 128.494 seconds 1000000 operations;    0.9 MB/s
rocksdb.compact.read.bytes COUNT : 807474156
rocksdb.compact.write.bytes COUNT : 781977610
rocksdb.flush.write.bytes COUNT : 71098785
```

Reviewed By: ajkr

Differential Revision: D52637771

Pulled By: cbi42

fbshipit-source-id: 4f2c7925d0c3a718635c948ea0d4981ed9fabec3
(cherry picked from commit 4b684e96b71cd6f1d50e29fae8b55c323ccd2869)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants