feat(wal): reduce concurrent conflicts between block write operations and poll operations #1554

CLFutureX · 2024-07-11T08:04:25Z

issue

…and poll operations (AutoMQ#1550)

CLAassistant · 2024-07-11T08:08:47Z

All committers have signed the CLA.

Chillax-0v0 · 2024-07-11T08:14:47Z

Hello, @RapperCL, you can update your code within the same PR; there's no need to create multiple PRs.

s3stream/src/main/java/com/automq/stream/s3/wal/impl/block/SlidingWindowService.java

Chillax-0v0 · 2024-07-11T08:45:15Z

We have provided a tool called "WriteBench" for testing WAL performance. You can use it to compare the performance before and after the modifications to confirm their effectiveness.

CLFutureX · 2024-07-11T08:48:01Z

We have provided a tool called "WriteBench" for testing WAL performance. You can use it to compare the performance before and after the modifications to confirm their effectiveness.

Okay, I'll find some time later to test it.

…and poll operations (AutoMQ#1550)

Chillax-0v0 · 2024-07-11T08:57:28Z

#1550 (comment)

…and poll operations (AutoMQ#1550)

s3stream/src/main/java/com/automq/stream/s3/wal/impl/block/SlidingWindowService.java

…and poll operations (AutoMQ#1550)

… and poll operations (AutoMQ#1550)

CLFutureX · 2024-07-13T15:49:39Z

We have provided a tool called "WriteBench" for testing WAL performance. You can use it to compare the performance before and after the modifications to confirm their effectiveness.

Hey, I conducted some simple local tests based on WriteBench on my side, and the test results are basically consistent with the expectations.：The performance improvement becomes more evident as the concurrency level increases. Here are the test data for two different scenarios:
Current local computer environment: 8c, 32GB , 1TB SSD (KXG60ZNV1T02 by KIOXIA). (There are other applications running locally, so the test results can be used as a reference.)

superhx · 2024-07-15T01:06:32Z

This PR did indeed greatly improve the performance of WAL in many competitive scenarios.

However, it also makes the correctness of WAL in a multi-concurrent context difficult for humans to understand. Considering that the number of concurrent writes in reality is usually consistent with the number of CPU cores, the current locking model is not a bottleneck for WAL performance. It is not recommended to introduce a complex locking model to avoid increasing maintenance costs.

Is it possible to have better thread & concurrency models in the future for WAL to have both good performance and better correctness guarantees? (For example, single-thread or multi single-thread)

CLFutureX · 2024-07-15T03:04:42Z

This PR did indeed greatly improve the performance of WAL in many competitive scenarios.

However, it also makes the correctness of WAL in a multi-concurrent context difficult for humans to understand. Considering that the number of concurrent writes in reality is usually consistent with the number of CPU cores, the current locking model is not a bottleneck for WAL performance. It is not recommended to introduce a complex locking model to avoid increasing maintenance costs.

Is it possible to have better thread & concurrency models in the future for WAL to have both good performance and better correctness guarantees? (For example, single-thread or multi single-thread)

In fact, previously, i had considered optimizing with multiple single-threaded models, but given the strong ordering guarantees required in the current scenario, it's difficult to isolate concurrent competition through this approach.

I understand that optimizing with multiple single-threaded models is not suitable for scenarios that globally guarantee ordering, but is more suited to scenarios where local ordering is required.

Based on the current write design, it is necessary to ensure the ordering of blocks during writing as well as the correctness of the windowData.startOffset update operations. This, in turn, requires ensuring the ordering of writeBlocks, and subsequently, the ordering of poll operations.

This leads to the need for concurrent safety in managing competition among write operations, between write and poll operations, between poll and IO operations, as well as among IO operations.

Previously, global ordering was ensured through blockLock.

After the current optimization: Within write operations, blockLock is used; while for the competition between write and poll operations, between poll and IO operations, and among IO operations, pollBlockLock is introduced.

Based on the multiple single-threaded model, potential optimization points include:

Optimizing the IO thread pool into a multiple single-threaded model to reduce concurrent competition among IO threads for the same blocking queue. However, considering the current design of the poll thread with frequency control and batching, the competition for the blocking queue may not be intense.

Additionally, the introduction of pollBlockLock in the current pull request (PR), based on the principle of lock splitting, does not significantly increase complexity. Conceptually, it serves as a lock for poll nodes and does not conflict with existing concepts. In practice, upon closer inspection, this lock merely replaces the original BlockLock in its places of usage.

… and poll operations (AutoMQ#1550)

CLFutureX · 2024-07-17T02:12:15Z

@Chillax-0v0 @superhx PTAL
The following new changes have been made:

Not only optimizations in performance but also enhancements in the threading model.

Removal of Additional Lock: The newly added lock, pollBlockLock, has been removed.
Single-Threaded Model for Poll Operations: Based on the designed wakeup mechanism, the poll operation has been completely decoupled from the write operation. The write thread no longer participates in the poll operation, ensuring a pure single-threaded model for poll operations.
Updating startOffset for IO Operations: The update process for startOffset is now handled through a combination of volatile variables and a concurrent-safe queue. This decoupling separates IO operations from both poll and write operations.

Adjusted Threading Model: The reactor master-slave threading model is now implemented as follows:

Master Thread: poll thread.
Slave Threads: IO threads.

feat(wal):reduce concurrent conflicts between block write operations …

cb307e8

…and poll operations (AutoMQ#1550)

CLFutureX requested review from superhx, SCNieh, ShadowySpirits and Chillax-0v0 as code owners July 11, 2024 08:04

Chillax-0v0 changed the title ~~feat(wal):reduce concurrent conflicts between block write operations and poll operations (#1550)~~ feat(wal): reduce concurrent conflicts between block write operations and poll operations (#1550) Jul 11, 2024

Chillax-0v0 changed the title ~~feat(wal): reduce concurrent conflicts between block write operations and poll operations (#1550)~~ feat(wal): reduce concurrent conflicts between block write operations and poll operations Jul 11, 2024

This comment was marked as resolved.

Sign in to view

Chillax-0v0 reviewed Jul 11, 2024

View reviewed changes

feat(wal):reduce concurrent conflicts between block write operations …

1ccbeff

…and poll operations (AutoMQ#1550)

feat(wal):reduce concurrent conflicts between block write operations …

b46383f

…and poll operations (AutoMQ#1550)

Chillax-0v0 reviewed Jul 12, 2024

View reviewed changes

s3stream/src/main/java/com/automq/stream/s3/wal/impl/block/SlidingWindowService.java Outdated Show resolved Hide resolved

CLFutureX added 3 commits July 12, 2024 11:16

feat(wal):reduce concurrent conflicts between block write operations …

0847d72

…and poll operations (AutoMQ#1550)

feat(wal):reduce concurrent conflicts between block write operations …

3cc61aa

…and poll operations (AutoMQ#1550)

feat(wal): reduce concurrent conflicts between block write operations…

8ffec25

… and poll operations (AutoMQ#1550)

Chillax-0v0 previously approved these changes Jul 12, 2024

View reviewed changes

feat(wal): reduce concurrent conflicts between block write operations…

12b517f

… and poll operations (AutoMQ#1550)

CLFutureX dismissed Chillax-0v0’s stale review via 12b517f July 17, 2024 02:10

CLFutureX closed this Aug 15, 2024

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(wal): reduce concurrent conflicts between block write operations and poll operations #1554

feat(wal): reduce concurrent conflicts between block write operations and poll operations #1554

CLFutureX commented Jul 11, 2024

CLAassistant commented Jul 11, 2024 •

edited

Loading

Chillax-0v0 commented Jul 11, 2024

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

Chillax-0v0 commented Jul 11, 2024

CLFutureX commented Jul 11, 2024

Chillax-0v0 commented Jul 11, 2024

CLFutureX commented Jul 13, 2024 •

edited

Loading

superhx commented Jul 15, 2024

CLFutureX commented Jul 15, 2024 •

edited

Loading

CLFutureX commented Jul 17, 2024

feat(wal): reduce concurrent conflicts between block write operations and poll operations #1554

feat(wal): reduce concurrent conflicts between block write operations and poll operations #1554

Conversation

CLFutureX commented Jul 11, 2024

CLAassistant commented Jul 11, 2024 • edited Loading

Chillax-0v0 commented Jul 11, 2024

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

Chillax-0v0 commented Jul 11, 2024

CLFutureX commented Jul 11, 2024

Chillax-0v0 commented Jul 11, 2024

CLFutureX commented Jul 13, 2024 • edited Loading

superhx commented Jul 15, 2024

CLFutureX commented Jul 15, 2024 • edited Loading

CLFutureX commented Jul 17, 2024

CLAassistant commented Jul 11, 2024 •

edited

Loading

CLFutureX commented Jul 13, 2024 •

edited

Loading

CLFutureX commented Jul 15, 2024 •

edited

Loading