Removed turn() API on the Runtime from 0.2.0-alpha.6 to 0.2.0 #1887

Closed
sdroege opened this issue Dec 3, 2019 · 12 comments

@sdroege
Contributor

sdroege commented Dec 3, 2019

In tokio 0.2.0-alpha.6 it was still possible to construct a "Runtime" yourself by taking tokio-reactor, tokio-timer and tokio-current-thread and putting them together. It could then be run() or driven manually via turn().

With tokio 0.2.0 merging everything into a single crate and reorganizing the internals, this is no longer possible, which makes it hard to port one of our projects from the alpha version to the stable version.


I should probably start by giving some context. The project in question is a GStreamer plugin, gst-plugin-threadshare, that uses tokio as a scheduler so that fewer kernel threads are needed, for lower resource usage and higher throughput. You can also find a blog post of mine with some numbers and more details.

Now, the reasons for putting together our own runtime here were the following:

  1. We want to throttle (per runtime) the number of calls to epoll() or similar, i.e. the reactor. By doing so we reduce the number of wakeups (and thus context switches), which considerably reduces CPU usage and increases throughput. See my blog post for some details. Maybe this is a feature that would also be useful in tokio?
    1.1 Because of the throttling it was necessary to implement our own timer infrastructure, as tokio's timers don't know anything about the throttling and would usually be triggered much later than needed. By knowing the throttling interval, our own timers trigger at most half an interval too early or too late, instead of on average half an interval too late (see the sketch after this list). Also, back in tokio 0.1 it seemed like the tokio interval timers were actually drifting when throttled.
    1.2 The custom timer implementation had to be wrapped around the calls to turn() and also needed a way to unpark() the reactor whenever the list of timers changed such that the next wakeup would have to happen earlier.
  2. We want to use a single thread for the whole runtime (executor, reactor, timers) instead of having it distributed over multiple threads. The reason is again resource usage and the overhead from context switches. You can see from the numbers in my blog post that this also made quite a difference.
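
To make 1.1 more concrete, here is a small, hypothetical helper (not code from the plugin) showing how a deadline can be snapped to the nearest throttle tick so that the error is bounded by half the interval:

```rust
use std::time::{Duration, Instant};

// Hypothetical helper (not from the plugin): round a timer deadline to the
// *nearest* reactor tick, given the throttle interval, so the timer fires at
// most half an interval early or late instead of up to a whole interval late.
fn nearest_tick(base: Instant, interval: Duration, deadline: Instant) -> Instant {
    let elapsed = deadline.saturating_duration_since(base);
    // Integer rounding to the nearest multiple of the interval.
    let ticks = (elapsed.as_nanos() + interval.as_nanos() / 2) / interval.as_nanos();
    base + interval * ticks as u32
}
```

For example, with a 20ms interval a timer requested for 47ms after the base fires at the 40ms tick (7ms early) rather than at the 60ms tick (13ms late).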

From what I can see, 2. is not necessary anymore nowadays with the basic_scheduler() option of the runtime Builder. 1. is still necessary.
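
For reference, a minimal sketch of what 2. looks like with the tokio 0.2 Builder (assuming the standard 0.2 API, nothing plugin-specific): executor, I/O driver and timers all run on the thread that calls block_on().

```rust
use tokio::runtime::Builder;

fn main() -> std::io::Result<()> {
    // Single-threaded runtime: executor, I/O driver and timers all run on the
    // thread that calls block_on().
    let mut rt = Builder::new()
        .basic_scheduler()
        .enable_all() // I/O driver + timers
        .build()?;

    rt.block_on(async {
        // Tasks spawned here run on this same thread.
    });

    Ok(())
}
```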

What would you suggest for moving forward with this? Adding such a throttling feature to tokio directly? Exposing ways to hook into the runtime behaviour for implementing this outside tokio again somehow? Anything else? :)

@carllerche
Member

Interesting. Re 1. it would be useful to get some benchmarks together on tokio 0.2 to demonstrate this.

@sdroege
Contributor Author

sdroege commented Dec 3, 2019

> Interesting. Re 1. it would be useful to get some benchmarks together on tokio 0.2 to demonstrate this.

I can prepare something, but this is nothing tokio can really mitigate without throttling calls to epoll(). The problem is simply that if packets arrive randomly, you wake up your threads all the time and process a single small packet just to go back to sleep again, calling epoll() for basically every packet. With throttling you can handle lots of packets at once and only call epoll() once for the whole batch. Syscalls and context switches are expensive.

I'm unsure, however, how to prepare such a benchmark. I can show that tokio 0.2.0-alpha.6 with throttling has a lot more throughput than stable tokio 0.2.0 without throttling, but that's comparing apples and oranges :) I can also compare tokio 0.2.0-alpha.6 with vs. without throttling.

@carllerche
Member

Just a little app that demonstrates a case where throttling is helpful. That would be something to experiment with.

@carllerche
Member

After docs, I plan on setting up a benchmark suite... stuff like ^^ that demonstrates "real world" patterns would be helpful to add.

@sdroege
Contributor Author

sdroege commented Dec 3, 2019

See the blog post I linked above, but I can prepare a new version of that benchmark just on top of tokio, without other dependencies. That probably helps, and that throttling improves the situation could then be shown by simply adding a sleep() at a strategic place inside tokio.

I'll take a look at that later today or tomorrow.

@sdroege
Contributor Author

sdroege commented Dec 3, 2019

I have a small example; I'll clean it up and put it up somewhere later. With 1000 UDP sockets and one packet every 20ms on each, it gives about 22% CPU with a single basic runtime and about 23% with two basic runtimes in separate threads. With throttling so that io::driver::park is called at most once every 20ms, it gives around 16% with a single basic runtime and around 17% with two basic runtimes. This is only receiving 160-byte UDP packets and dropping them.

Compared to my benchmarks from 1.5 years ago, which were on tokio 0.1 and had additional overhead from GStreamer, these are very similar results. With even more sockets the effect will be more visible; I'll create a table with various results later.

Note: for both cases I changed MAX_TASKS_PER_TICK in the basic scheduler to infinity, otherwise it would not handle all sockets per tick.
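
The receiver side of such a benchmark might look roughly like the sketch below (tokio 0.2 API; the port range, socket count and the bounded per-socket packet count are placeholders, not taken from the actual benchmark):

```rust
use std::net::SocketAddr;
use tokio::net::UdpSocket;
use tokio::runtime::Builder;

fn main() -> std::io::Result<()> {
    // Single-threaded runtime with the I/O driver enabled.
    let mut rt = Builder::new().basic_scheduler().enable_io().build()?;

    rt.block_on(async {
        let mut handles = Vec::new();
        for i in 0..1000u16 {
            // One UDP socket per simulated stream.
            let addr = SocketAddr::from(([127, 0, 0, 1], 5000 + i));
            let mut socket = UdpSocket::bind(addr).await?;
            handles.push(tokio::spawn(async move {
                let mut buf = [0u8; 1500];
                // Receive packets and drop them immediately; bounded here only
                // to keep the sketch finite.
                for _ in 0..10_000u32 {
                    let _ = socket.recv_from(&mut buf).await;
                }
            }));
        }
        for handle in handles {
            let _ = handle.await;
        }
        Ok(())
    })
}
```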

@carllerche
Member

I agree that there is probably some strategy we could use to throttle calls to the I/O driver.

@sdroege
Contributor Author

sdroege commented Dec 3, 2019

Code can be found here. Run with `cargo run --bin sender [num-sockets]` and `cargo run --bin receiver [num-sockets] [num-threads]`.

My measurements before were slightly wrong; I had implemented the throttling incorrectly. The patch can be found at the bottom.

| Threads | Throttle | Sockets | CPU |
| --- | --- | --- | --- |
| X | 0ms | 1000 | 35% |
| 1 | 0ms | 1000 | 11% |
| 2 | 0ms | 1000 | 12% |
| 1 | 20ms | 1000 | 10% |
| 2 | 20ms | 1000 | 10% |
| X | 0ms | 2000 | 72% |
| 1 | 0ms | 2000 | 22% |
| 2 | 0ms | 2000 | 23% |
| 1 | 20ms | 2000 | 18% |
| 2 | 20ms | 2000 | 20% |
| X | 0ms | 4000 | 147% |
| 1 | 0ms | 4000 | 48% |
| 2 | 0ms | 4000 | 50% |
| 1 | 20ms | 4000 | 28% |
| 2 | 20ms | 4000 | 36% |

The X rows are with the default (threaded) runtime, which creates 4 threads here.

Patch for throttling below. It does not yet take the throttling into account for the timers (but it should):

```diff
diff --git a/tokio/src/runtime/basic_scheduler.rs b/tokio/src/runtime/basic_scheduler.rs
index c674b961..c5427f20 100644
--- a/tokio/src/runtime/basic_scheduler.rs
+++ b/tokio/src/runtime/basic_scheduler.rs
@@ -71,6 +71,8 @@ struct LocalState<P> {
 
     /// Thread park handle
     park: P,
+
+    last_tick: Option<std::time::Instant>,
 }
 
 #[derive(Debug)]
@@ -110,7 +112,7 @@ where
                 pending_drop: task::TransferStack::new(),
                 unpark: Box::new(unpark),
             }),
-            local: LocalState { tick: 0, park },
+            local: LocalState { tick: 0, park, last_tick: None },
         }
     }
 
@@ -218,7 +220,7 @@ impl Spawner {
 
 impl SchedulerPriv {
     fn tick(&self, local: &mut LocalState<impl Park>) {
-        for _ in 0..MAX_TASKS_PER_TICK {
+        loop {
             // Get the current tick
             let tick = local.tick;
 
@@ -227,10 +229,7 @@ impl SchedulerPriv {
 
             let task = match self.next_task(tick) {
                 Some(task) => task,
-                None => {
-                    local.park.park().ok().expect("failed to park");
-                    return;
-                }
+                None => break,
             };
 
             if let Some(task) = task.run(&mut || Some(self.into())) {
@@ -240,9 +239,21 @@ impl SchedulerPriv {
             }
         }
 
+        if let Some(last_tick) = local.last_tick {
+            use std::thread;
+
+            let now = std::time::Instant::now();
+            let diff = now - last_tick;
+            const WAIT: std::time::Duration = std::time::Duration::from_millis(20);
+            if diff < WAIT {
+                thread::sleep(WAIT - diff);
+            }
+        }
+        local.last_tick = Some(std::time::Instant::now());
+
         local
             .park
-            .park_timeout(Duration::from_millis(0))
+            .park()
             .ok()
             .expect("failed to park");
    }
```

@sdroege
Contributor Author

sdroege commented Dec 4, 2019

> I agree that there is probably some strategy we could use to throttle calls to the I/O driver.

Or, alternatively, it would be great if there were an API that allowed replacing the runtime, or parts of it, with a custom runtime, like there was before :)

@carllerche
Member

I'm not against it. The permutation details need to be figured out.

fengalin pushed commits to fengalin/tokio that referenced this issue 13 times between Dec 5, 2019 and Apr 11, 2020.
@Darksonn
Contributor

Darksonn commented Jul 25, 2020

This appears related to #2443, and maybe also #1583, #2545.

@carllerche
Member

Closing due to inactivity.

For future reference, I am not necessarily against adding the ability to configure throttling, but I would like to see it demonstrated that doing it at the Tokio level provides measurable benefit over implementing batching logic in userland.
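
As a rough illustration of what such userland batching could look like (a hypothetical sketch against the tokio 0.2 API, not code from this thread): instead of throttling the I/O driver itself, each receiver task wakes up on a fixed interval and drains whatever is already queued on its socket.

```rust
use std::time::Duration;
use tokio::net::UdpSocket;
use tokio::time;

// Hypothetical userland batching: the task sleeps on a fixed interval and then
// drains its socket, so the I/O driver is consulted roughly once per tick for
// this socket instead of once per packet.
async fn throttled_receiver(mut socket: UdpSocket, tick: Duration) {
    let mut interval = time::interval(tick);
    let mut buf = [0u8; 1500];
    loop {
        interval.tick().await;
        // A zero timeout turns the async recv_from() into a "poll once, then
        // give up" probe, so only packets that are already queued are taken.
        while let Ok(Ok((_len, _addr))) =
            time::timeout(Duration::from_millis(0), socket.recv_from(&mut buf)).await
        {
            // process the packet here
        }
    }
}
```

Whether batching at this level recovers the same gains as throttling inside the driver is exactly the kind of thing the benchmark above could be used to measure.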
