
Make empty Dict only have space for 0 rather than 16 elements (and the minimum size for non-empty Dict down to 4). #52912

Merged

Conversation

@oscardssmith (Member) commented Jan 15, 2024

This improves speed and reduces allocation for small Dicts.

#before
julia> @btime Dict{Int,Int}()
  46.815 ns (4 allocations: 448 bytes)

#after
julia> @btime Dict{Int, Int}()
  9.429 ns (1 allocation: 80 bytes)

The initialization to 16 made sense back when Dict was Vector-backed, since Vectors were slow to instantiate, but Memory is much faster to create.
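
For concreteness, here is a hedged sketch of the idea (the struct and field names are illustrative, not the actual Base code): the empty constructor starts every backing Memory buffer at length 0, and the first insertion grows the table to the new minimum non-empty size of 4.

# Illustrative sketch only -- not the real Base.Dict definition.
mutable struct ToyDict{K,V}
    slots::Memory{UInt8}   # per-slot metadata
    keys::Memory{K}
    vals::Memory{V}
    count::Int
end

# Empty constructor: zero-length buffers, which Memory can hand out without allocating.
ToyDict{K,V}() where {K,V} =
    ToyDict{K,V}(Memory{UInt8}(undef, 0), Memory{K}(undef, 0), Memory{V}(undef, 0), 0)

# The first insert would then grow the buffers to the minimum non-empty size of 4.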

@oscardssmith oscardssmith added the performance Must go faster label Jan 15, 2024
@vtjnash (Sponsor Member) commented Jan 16, 2024

Starting with zero might be much better, since it doesn't allocate at all?

@oscardssmith (Member, Author)

The two arguments against 0 are that it makes resizing awkward (otherwise resize always just multiplies the size by 4 or 2) and that, heuristically, we expect most Dicts to eventually hold at least one element.
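
To make the "awkward resizing" point concrete, here is a hypothetical version of such a growth rule (the function name and the 64-slot threshold are made up for illustration, not taken from Base):

# Hypothetical growth rule, not the actual Base heuristic.
function grow_size(oldsz::Int)
    oldsz == 0 && return 4                      # zero needs its own branch, since 4 * 0 == 0
    return oldsz < 64 ? 4 * oldsz : 2 * oldsz   # otherwise just multiply by 4 or 2
end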

@vtjnash (Sponsor Member) commented Jan 16, 2024

Counter-point: resize should already have a special case to make sure it allocates enough initially. And the initial sizehint!(Dict(), n) should become much cheaper, which in turn makes union! and friends cheaper.
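
As a usage illustration of that point (timings deliberately omitted, since they depend on the build): the common pattern is to create an empty Dict and immediately sizehint! it, so any storage allocated by the empty constructor is wasted.

julia> d = Dict{Int,Int}();            # with this PR, no table storage is allocated yet

julia> sizehint!(d, 1000);             # one resize straight to the needed capacity

julia> for i in 1:1000; d[i] = i; end  # inserts proceed without intermediate rehashes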

@grandinj
Just a drive-by comment: the Java HotSpot engineers did significant profiling work on Java's ArrayList class, and the default initial capacity there is 10. That should not be taken to mean that 10 is best (it is an implementation-dependent number), only that a non-zero number is probably best.

@oscardssmith (Member, Author)

That is good to know, but it is worth remembering that Java doesn't have tuples, which will affect the distribution of collection sizes. We also have an optimization where a length-0 Memory is especially fast, since it doesn't have to allocate at all.
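
A quick way to see the fast path being referred to (exact timings will vary; the zero-allocation behavior is the point):

julia> using BenchmarkTools

julia> @btime Memory{Int}(undef, 0);   # expected: 0 allocations, thanks to the empty fast path

julia> @btime Memory{Int}(undef, 16);  # expected: 1 allocation, a regular small buffer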

This improves speed for small Dicts.
(the 16 before made sense when Dict was Vector backed since Vectors were slow to instantiate,
but Memory is much faster to create and has an extra fast path for size 0)
@oscardssmith oscardssmith changed the title Make empty Dict only have space for 4 rather than 16 elements. Make empty Dict only have space for 0 rather than 16 elements (and the minimum size for non-empty Dict down to 4). Jan 21, 2024
@oscardssmith (Member, Author)

I've now modified this to initialize to 0 elements as suggested. We should run the nanosoldier benchmarks to make sure this doesn't cause any regressions, but other than that, I think this is ready to merge.

@oscardssmith oscardssmith added the needs nanosoldier run This PR should have benchmarks run on it label Jan 21, 2024
@vtjnash (Sponsor Member) commented Jan 21, 2024

@nanosoldier runbenchmarks(!"scalar" && !"union" && !"array", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@oscardssmith (Member, Author)

That's a lot of noise, but I don't think any of the regressions are real (though it's somewhat hard to say).

@vtjnash (Sponsor Member) commented Jan 21, 2024

Maybe we'd see more impact from changing IdDict, since its default is pretty massive?

julia> IdDict().ht
32-element Memory{Any}:

@oscardssmith (Member, Author)

Note that it's effectively also only 16 entries, since it stores a single interleaved list of key/value pairs rather than two separate lists.
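
To make the interleaving concrete (illustrative layout only; the exact slot scheme is an implementation detail):

# Keys and values share one flat buffer, alternating:
#   ht[1] => key1, ht[2] => val1, ht[3] => key2, ht[4] => val2, ...
# so the 32-element buffer shown above holds at most 16 entries,
# whereas Dict keeps separate keys and vals buffers.

julia> length(IdDict().ht) ÷ 2
16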

@PallHaraldsson (Contributor) commented Jan 22, 2024

> Starting with zero might be much better, since it doesn't allocate at all?

Why is that relevant to anyone? Are 0-sized Dicts used much at all, other than as a starting point? You can benchmark the empty constructor on its own, but you would (almost?) always add something to it afterwards, so you're just deferring the cost; the whole operation in context (which is what you really want to benchmark) would not be faster with a 0-sized start, and might possibly be slower?

> The 2 arguments against 0 are that it makes the resizing awkward

You mean a special case (which makes it always slower) that we want to avoid?

@jakobnissen (Contributor) commented Jan 22, 2024

I tested this PR on an application-like workload of mine, which uses Dicts fairly heavily: around 1/3 of runtime is spent in dict lookups, and at least 25% of runtime on get! to build a dictionary. None of my dicts ought to be empty, but it's likely that a large fraction of them have only one or two keys. The total runtime is unchanged on this PR vs 47d31ac (3.135 vs 3.131 seconds).

@gbaraldi (Member)

@PallHaraldsson the reason for starting with 0 is that a 0-sized array is free to allocate, and there isn't much benefit in paying the allocation cost up front.
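
One hedged way to test that concern directly (machine-dependent numbers omitted; the interesting comparison is master vs. this PR for both lines):

julia> using BenchmarkTools

julia> @btime Dict{Int,Int}();                     # pure construction: much cheaper on this PR

julia> @btime (d = Dict{Int,Int}(); d[1] = 1; d);  # construction plus first insert: this is where
                                                   # the deferred table allocation shows up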

@lassepe (Sponsor Contributor) commented Jan 22, 2024

I tested this on pre-1.0 ProtoBuf, which relied heavily on Dict as a backing data structure (ref JuliaIO/ProtoBuf.jl#179).

Below are the benchmark results obtained from the MWE of JuliaIO/ProtoBuf.jl#179. TL;DR: A solid drop in allocations, a minor drop in total time.

  1. Nightly: [benchmark screenshot]

  2. This PR: [benchmark screenshot]

@oscardssmith (Member, Author)

Given the testing, I think this is ready to merge.

@oscardssmith (Member, Author)

Another reason why this is very important :):

python3 -m timeit "{}"
20000000 loops, best of 5: 12.8 nsec per loop

@oscardssmith oscardssmith removed the needs nanosoldier run This PR should have benchmarks run on it label Jan 23, 2024
@oscardssmith oscardssmith merged commit d741c24 into JuliaLang:master Jan 23, 2024
9 of 10 checks passed
@oscardssmith oscardssmith deleted the os/empty-make-Dict-smaller branch January 23, 2024 14:57
vtjnash referenced this pull request Jan 27, 2024