Optimize array allocation. #2355

Merged: maleadt merged 2 commits into master from tb/optimize_alloc on Apr 29, 2024

maleadt (Member) commented on Apr 29, 2024:

Before:

julia> @benchmark CuArray{Float32}(undef, 1)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.226 μs …  8.022 ms  ┊ GC (min … max): 0.00% … 11.43%
 Time  (median):     1.316 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.129 μs ± 80.204 μs  ┊ GC (mean ± σ):  4.31% ±  0.11%

           ▂▄▅▅▇██▇▇▆▃▅▃▄▃▃▁▁▁▁
  ▂▁▂▂▂▃▄▅▇█████████████████████▇▆▅▅▄▄▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▅
  1.23 μs        Histogram: frequency by time        1.51 μs <

 Memory estimate: 464 bytes, allocs estimate: 10.

After:

julia> @benchmark CuArray{Float32}(undef, 1)
BenchmarkTools.Trial: 10000 samples with 119 evaluations.
 Range (min … max):  758.815 ns …  2.191 ms  ┊ GC (min … max): 0.00% … 15.43%
 Time  (median):     824.029 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.545 μs ± 35.931 μs  ┊ GC (mean ± σ):  5.17% ±  0.22%

            ▂▅▆▆▆▆▆▇▆█▆▇▅▄▁▁
  ▁▁▁▁▂▃▄▅▇█████████████████▇▆▆▅▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁ ▄
  759 ns          Histogram: frequency by time          965 ns <

 Memory estimate: 208 bytes, allocs estimate: 5.

Allocation performance had regressed in recent PRs compared to 5.3, where the same benchmark gave:

julia> @benchmark CuArray{Float32}(undef, 1)
BenchmarkTools.Trial: 10000 samples with 102 evaluations.
 Range (min … max):  774.598 ns …  1.552 ms  ┊ GC (min … max): 0.00% … 8.63%
 Time  (median):     881.559 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.911 μs ± 37.185 μs  ┊ GC (mean ± σ):  4.72% ± 0.25%

           ▁▁▁▁▃▃▃▁▁ ▁▁▂▃▄▆▇████▅▅▂
  ▁▂▃▂▂▂▃▅██████████████████████████▆▅▄▃▃▂▃▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▂▂▂▂ ▄
  775 ns          Histogram: frequency by time         1.03 μs <

 Memory estimate: 320 bytes, allocs estimate: 11.

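So we're back to the original 5.3 performance now (median 824 ns vs. 882 ns), while allocating significantly less: 208 bytes in 5 allocations, compared to 320 bytes in 11 allocations on 5.3 and 464 bytes in 10 allocations before this PR.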
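
To make the comparison concrete, here is a small sketch that derives relative figures from the medians and memory estimates reported in the three benchmark runs above. The names `runs`, `speedup`, and `bytes_saved` are hypothetical helpers introduced for illustration only; they are not part of the PR or of BenchmarkTools, and the numbers are simply copied from the output shown.

```julia
# Medians and memory estimates copied from the three benchmark runs above.
runs = (
    before = (median_ns = 1316.0,  bytes = 464, allocs = 10),
    after  = (median_ns = 824.029, bytes = 208, allocs = 5),
    v5_3   = (median_ns = 881.559, bytes = 320, allocs = 11),
)

# Ratio of a baseline median to the new median (> 1 means the new run is faster).
speedup(baseline, new) = baseline.median_ns / new.median_ns

# Fraction of bytes saved relative to a baseline.
bytes_saved(baseline, new) = 1 - new.bytes / baseline.bytes

# Compare the post-PR run against both baselines.
for name in (:before, :v5_3)
    base = getfield(runs, name)
    println(name, " -> after: ",
            round(speedup(base, runs.after); digits = 2), "x median speedup, ",
            round(100 * bytes_saved(base, runs.after); digits = 1), "% fewer bytes, ",
            base.allocs - runs.after.allocs, " fewer allocations")
end
```

Medians are used rather than means because the mean is dominated by rare millisecond-scale outliers (see the max column in each run).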

maleadt added the enhancement (New feature or request) and performance (How fast can we go?) labels on Apr 29, 2024
maleadt merged commit 750e2d3 into master on Apr 29, 2024
1 check was pending
maleadt deleted the tb/optimize_alloc branch on April 29, 2024, 10:19