Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync: reduce contention in Notify #5503

Merged
merged 7 commits into from
Apr 19, 2023
Merged

Conversation

satakuma
Copy link
Member

@satakuma satakuma commented Feb 25, 2023

This PR replaces the notification field used by waiters in Notify with an atomic variable to reduce lock contention when multiple pending Notified futures are completing.

Motivation

The typical lifecycle of a Notified future looks like this:

  1. It gets polled for the first time, and it is inserted into the waiting list.
  2. It receives a notification from a notify_one or notify_waiters call and is removed from the waiting list.
  3. The notifying function calls wake() on the future's waker.
  4. The future gets polled again by the executor, consumes the notification and returns Ready.

Currently, step 4 requires acquiring the lock to read the notification field and the lock is released immediately after. Using an atomic variable here should significantly reduce lock contention, especially when multiple futures get notified at once by a notify_waiters call.

Benchmarks

Simple benchmarks on my machine (Linux 6.1.11, 4-core x86_64 developer laptop) show improved performance by 15-35%.

Click to expand results

master (c894069)

test notify_one<10>      ... bench:     128,130 ns/iter (+/- 18,168)
test notify_one<50>      ... bench:     226,002 ns/iter (+/- 20,095)
test notify_one<100>     ... bench:     199,244 ns/iter (+/- 18,873)
test notify_one<200>     ... bench:     210,647 ns/iter (+/- 28,222)
test notify_one<500>     ... bench:     218,932 ns/iter (+/- 29,464)

test notify_waiters<10>  ... bench:     289,161 ns/iter (+/- 25,546)
test notify_waiters<50>  ... bench:     248,969 ns/iter (+/- 47,217)
test notify_waiters<100> ... bench:     246,124 ns/iter (+/- 51,925)
test notify_waiters<200> ... bench:     250,637 ns/iter (+/- 44,959)
test notify_waiters<500> ... bench:     245,787 ns/iter (+/- 62,522)

with atomic notification:

test notify_one<10>      ... bench:     112,035 ns/iter (+/- 18,930)
test notify_one<50>      ... bench:     153,866 ns/iter (+/- 21,207)
test notify_one<100>     ... bench:     150,171 ns/iter (+/- 36,634)
test notify_one<200>     ... bench:     154,960 ns/iter (+/- 12,386)
test notify_one<500>     ... bench:     152,904 ns/iter (+/- 17,885)

test notify_waiters<10>  ... bench:     210,752 ns/iter (+/- 23,719)
test notify_waiters<50>  ... bench:     161,876 ns/iter (+/- 7,037)
test notify_waiters<100> ... bench:     157,448 ns/iter (+/- 25,909)
test notify_waiters<200> ... bench:     167,677 ns/iter (+/- 22,640)
test notify_waiters<500> ... bench:     290,953 ns/iter (+/- 39,285)

With many contending waiters (>500) benchmarks show degraded performance, but I did some profiling and the bottleneck seems to be somewhere in the scheduler.

@github-actions github-actions bot added the R-loom Run loom tests on this PR label Feb 25, 2023
@satakuma satakuma added A-tokio Area: The main tokio crate M-sync Module: tokio/sync labels Feb 25, 2023
@Darksonn
Copy link
Contributor

This PR is rather large. Will take me a bit until I can review it.

@satakuma
Copy link
Member Author

satakuma commented Feb 26, 2023

Sure, no problem. I've pushed an update which slightly reduces the size of this change. I can write down some key points to make the review easier.

Adding the atomic variable is quite straightforward, it boils down to adding atomic stores to notify_{one,waiters} functions and an atomic load to poll_notified. The reason why the diff ended up being so large is that I had to change the existing code to be correct aliasing-wise regarding the atomic field.

Previously, a Waiter lived inside an UnsafeCell and could be mutably borrowed after acquiring the lock which guards the waitlist. This PR avoids mutable borrows of Waiter and instead wraps the only mutable field (the waker) in an UnsafeCell to allow mutation. Technically, there is one, indirect mutable borrow of Waiter coming from poll's Pin<&mut Notified>, but that's a known issue of self-referential structs and this is safe because Waiter is !Unpin.

Waiters for the batch semaphore also use an atomic field for tracking the number of permits, and the code there is organized similarly to this PR. If this PR gets merged, I would like to also investigate if similar optimization can be applied in the IO driver.

tokio/src/sync/notify.rs Outdated Show resolved Hide resolved
tokio/src/sync/notify.rs Outdated Show resolved Hide resolved
Comment on lines 232 to 226
/// `true` if the notification has been assigned to this waiter.
notified: Option<NotificationType>,
/// Notification for this waiter. It is an enum and uses
/// `NOTIFICATION_{NONE, ONE, ALL}` values.
notification: AtomicUsize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code is ok, but the safety invariants invovled with this atomic should be documented more clearly.

* use wrapper type for the atomic notification field
* fix safety commments
* document the fact that the notification field is used
  to protect `waker` field
@satakuma
Copy link
Member Author

I've added some comments to better document the safety invariants. I've also moved the atomic variable into a wrapper type to use an enum instead of constants.

@hawkw hawkw self-requested a review March 17, 2023 16:29
Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed this again, and it LGTM.

@Darksonn Darksonn merged commit db54363 into master Apr 19, 2023
@Darksonn Darksonn deleted the satakuma/notify-contention branch April 19, 2023 11:07
crapStone pushed a commit to Calciumdibromid/CaBr2 that referenced this pull request Apr 29, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dependencies | minor | `1.27.0` -> `1.28.0` |
| [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dev-dependencies | minor | `1.27.0` -> `1.28.0` |

---

### Release Notes

<details>
<summary>tokio-rs/tokio</summary>

### [`v1.28.0`](https://github.com/tokio-rs/tokio/releases/tag/tokio-1.28.0): Tokio v1.28.0

[Compare Source](tokio-rs/tokio@tokio-1.27.0...tokio-1.28.0)

##### 1.28.0 (April 25th, 2023)

##### Added

-   io: add `AsyncFd::async_io` ([#&#8203;5542])
-   io: impl BufMut for ReadBuf ([#&#8203;5590])
-   net: add `recv_buf` for `UdpSocket` and `UnixDatagram` ([#&#8203;5583])
-   sync: add `OwnedSemaphorePermit::semaphore` ([#&#8203;5618])
-   sync: add `same_channel` to broadcast channel ([#&#8203;5607])
-   sync: add `watch::Receiver::wait_for` ([#&#8203;5611])
-   task: add `JoinSet::spawn_blocking` and `JoinSet::spawn_blocking_on` ([#&#8203;5612])

##### Changed

-   deps: update windows-sys to 0.48 ([#&#8203;5591])
-   io: make `read_to_end` not grow unnecessarily ([#&#8203;5610])
-   macros: make entrypoints more efficient ([#&#8203;5621])
-   sync: improve Debug impl for `RwLock` ([#&#8203;5647])
-   sync: reduce contention in `Notify` ([#&#8203;5503])

##### Fixed

-   net: support `get_peer_cred` on AIX ([#&#8203;5065])
-   sync: avoid deadlocks in `broadcast` with custom wakers ([#&#8203;5578])

##### Documented

-   sync: fix typo in `Semaphore::MAX_PERMITS` ([#&#8203;5645])
-   sync: fix typo in `tokio::sync::watch::Sender` docs ([#&#8203;5587])

[#&#8203;5065]: tokio-rs/tokio#5065

[#&#8203;5503]: tokio-rs/tokio#5503

[#&#8203;5542]: tokio-rs/tokio#5542

[#&#8203;5578]: tokio-rs/tokio#5578

[#&#8203;5583]: tokio-rs/tokio#5583

[#&#8203;5587]: tokio-rs/tokio#5587

[#&#8203;5590]: tokio-rs/tokio#5590

[#&#8203;5591]: tokio-rs/tokio#5591

[#&#8203;5607]: tokio-rs/tokio#5607

[#&#8203;5610]: tokio-rs/tokio#5610

[#&#8203;5611]: tokio-rs/tokio#5611

[#&#8203;5612]: tokio-rs/tokio#5612

[#&#8203;5618]: tokio-rs/tokio#5618

[#&#8203;5621]: tokio-rs/tokio#5621

[#&#8203;5645]: tokio-rs/tokio#5645

[#&#8203;5647]: tokio-rs/tokio#5647

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS42My4xIiwidXBkYXRlZEluVmVyIjoiMzUuNjQuMCJ9-->

Co-authored-by: cabr2-bot <[email protected]>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1875
Reviewed-by: crapStone <[email protected]>
Co-authored-by: Calciumdibromid Bot <[email protected]>
Co-committed-by: Calciumdibromid Bot <[email protected]>
icycrystal4 added a commit to weibocom/breeze that referenced this pull request May 17, 2023
icycrystal4 added a commit to weibocom/breeze that referenced this pull request May 18, 2023
icycrystal4 added a commit to weibocom/breeze that referenced this pull request May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-sync Module: tokio/sync R-loom Run loom tests on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants