How do I initialise a structure so it's shared between threads of the same bench iteration? #51

anko · 2024-05-02T17:02:04Z

I'm trying to benchmark a concurrent data structure, and I want to benchmark its read/write behaviour under thread contention. However, unlike all of the threaded examples in documentation, this structure's performance characteristics change as it is modified: internal parts of it are consumed or rearranged by different threads, so it needs to be constructed again for each run of the benchmark.

This means:

If the tested structure is made static, or initialised in the benchmark function body before calling divan::Bencher methods, then only the first iteration sees the structure as it was constructed. The other iterations see one whose contents have already been consumed by the first iteration, with almost no work left to benchmark.
```
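#[divan::bench(threads = [1, 2, 4, 8, 16])]
fn benchmark_function(bencher: divan::Bencher) {
    // A `static` initialiser must be const-evaluable, so this assumes
    // `create_structure` is a `const fn`.
    static X: MyStruct = create_structure();
    bencher
        .bench(|| X.consume_contents());
}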
```
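A minimal std-only sketch of this effect, with Divan left out entirely. The real `MyStruct` is not shown here, so the mutex-guarded queue, the `n` parameter of `create_structure`, and the `work_per_iteration` helper are all hypothetical stand-ins chosen for illustration:

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for `MyStruct`: a shared queue of work items.
pub struct MyStruct {
    items: Mutex<Vec<u64>>,
}

/// Hypothetical constructor; fills the structure with `n` work items.
pub fn create_structure(n: u64) -> MyStruct {
    MyStruct { items: Mutex::new((0..n).collect()) }
}

impl MyStruct {
    /// Pops items until none remain and returns how many this call consumed.
    pub fn consume_contents(&self) -> usize {
        let mut consumed = 0;
        while self.items.lock().unwrap().pop().is_some() {
            consumed += 1;
        }
        consumed
    }
}

/// Calls `consume_contents` `iterations` times on one shared instance,
/// the way repeated benchmark iterations would see a `static`.
pub fn work_per_iteration(iterations: usize, n: u64) -> Vec<usize> {
    let shared = create_structure(n);
    (0..iterations).map(|_| shared.consume_contents()).collect()
}
```

Calling `work_per_iteration(3, 100)` yields `[100, 0, 0]`: the first iteration drains everything and the later ones measure an empty structure.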

If it is initialised in with_inputs, each thread gets its own copy of the whole structure, so they never contend.

```
#[divan::bench(threads = [1, 2, 4, 8, 16])]
fn benchmark_function(bencher: divan::Bencher) {
    bencher
        .with_inputs(|| create_structure())
        .bench_values(|x| x.consume_contents());
}
```

Either the structure is constructed once and then shared among all iterations (the first option), or it is constructed separately for each thread and never shared (the second option). I need it to be constructed once per benchmark run and shared only among the threads that take part in that run. Do I correctly understand that this is currently not possible using the threads option?
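The difference between the second option and the wanted behaviour can be sketched with plain `std::thread::scope`, without Divan. This is only an illustration under assumptions: `MyStruct` is again a hypothetical mutex-guarded queue, `create_structure` takes a made-up item count `n`, and `per_thread_copies` and `shared_per_run` are names invented for this sketch:

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for `MyStruct`: a mutex-guarded queue of work items.
pub struct MyStruct {
    items: Mutex<Vec<u64>>,
}

/// Hypothetical constructor taking the number of work items.
pub fn create_structure(n: u64) -> MyStruct {
    MyStruct { items: Mutex::new((0..n).collect()) }
}

impl MyStruct {
    /// Pops items until the queue is empty; returns how many this caller got.
    pub fn consume_contents(&self) -> usize {
        let mut consumed = 0;
        while self.items.lock().unwrap().pop().is_some() {
            consumed += 1;
        }
        consumed
    }
}

/// Second option: every thread builds and drains its own structure.
pub fn per_thread_copies(threads: usize, n: u64) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|_| s.spawn(move || create_structure(n).consume_contents()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// Wanted behaviour: one structure per run, drained by all threads together.
pub fn shared_per_run(threads: usize, n: u64) -> Vec<usize> {
    let shared = create_structure(n);
    let shared_ref = &shared;
    thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|_| s.spawn(move || shared_ref.consume_contents()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

With per-thread copies, every thread consumes all `n` items on its own, so nothing is contended. With one structure per run, the per-thread counts vary from run to run but always add up to `n`, which is the contended workload worth measuring.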

My current workaround is to start a const number of threads myself inside the with_inputs closure and have them wait at a std::sync::Barrier, then as part of the bench_local_values closure, release the Barrier and join the threads to time them:

```
#[divan::bench(consts = [1, 2, 4, 8, 16])]
fn benchmark_function<const THREADS: usize>(bencher: divan::Bencher) {
    use std::sync::{Arc, Barrier};
    bencher
        // Untimed setup: build one structure and park THREADS workers at a barrier.
        .with_inputs(|| -> (Vec<std::thread::JoinHandle<_>>, _) {
            let x: Arc<MyStruct> = Arc::new(create_structure());
            // THREADS workers plus the benchmarking thread that releases them.
            let barrier = Arc::new(Barrier::new(THREADS + 1));
            let threads = (0..THREADS).map(|_| {
                let x = x.clone();
                let barrier = barrier.clone();
                std::thread::spawn(move || {
                    barrier.wait();
                    x.consume_contents();
                })
            }).collect();
            (threads, barrier)
        })
        // Timed section: release the workers, then wait for all of them to finish.
        .bench_local_values(|(threads, barrier)| {
            barrier.wait();
            for t in threads {
                t.join().unwrap();
            }
        });
}
```

This works, but there's a lot of code duplicating what I imagine Divan would do internally to implement the threads option.

I also see worse performance when benchmarking with 1 thread using this method than I do from an otherwise-identical benchmark with #[divan::bench(threads = [1])]. That is probably because Divan doesn't use a Barrier when single-threaded, which is smart, and is another reason why I feel this could be handled inside Divan.
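Whether Divan really skips the barrier for a single thread is a guess, but a hand-rolled helper can do so explicitly. The following std-only sketch is not Divan API: `time_shared_run`, `setup`, and `work` are hypothetical names, and it times the run itself with `Instant` instead of handing the closure to a benchmarking harness:

```rust
use std::sync::{Arc, Barrier};
use std::thread;
use std::time::{Duration, Instant};

/// Builds one shared value with `setup`, runs `work` on it from `threads`
/// threads at once, and returns the elapsed time plus each thread's result.
pub fn time_shared_run<T, R>(
    threads: usize,
    setup: impl FnOnce() -> T,
    work: fn(&T) -> R,
) -> (Duration, Vec<R>)
where
    T: Send + Sync + 'static,
    R: Send + 'static,
{
    assert!(threads >= 1, "need at least one thread");
    let shared = Arc::new(setup());

    // Single-threaded case: no spawn and no barrier inside the timed section.
    if threads == 1 {
        let start = Instant::now();
        let result = work(&shared);
        return (start.elapsed(), vec![result]);
    }

    // Multi-threaded case: park every worker at the barrier before timing.
    let barrier = Arc::new(Barrier::new(threads + 1));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&shared);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                barrier.wait();
                work(&shared)
            })
        })
        .collect();

    let start = Instant::now();
    barrier.wait(); // releases the workers once all of them have arrived
    let results = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (start.elapsed(), results)
}
```

As in the workaround above, the multi-threaded timing still includes the barrier release and the joins, so the single-threaded and multi-threaded numbers are not measured in exactly the same way.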

Am I missing a better existing way to do this?

If yes, could an example be added illustrating it?
If no, do you think this use-case could be handled by Divan?
