Profiler improvements: (textual) time distribution, at-bprofile. #2162

Merged
merged 6 commits into from
Nov 10, 2023

@maleadt (Member) commented Nov 9, 2023

julia> x = @bprofile time=0.01 CuArray([1]).+1
Profiler ran for 10.0 ms, capturing 13651 events.

Host-side activity: calling CUDA APIs took 7.28 ms (72.74% of the trace)
┌──────────┬────────────┬───────┬──────────────────────────────────────┬─────────────────────────┐
│ Time (%) │ Total time │ Calls │ Time distribution                    │ Name                    │
├──────────┼────────────┼───────┼──────────────────────────────────────┼─────────────────────────┤
│   18.28% │    1.83 ms │   525 │   3.48 µs ± 0.62   (   3.1 ‥ 15.02)  │ cuLaunchKernel          │
│   16.11% │    1.61 ms │   525 │   3.07 µs ± 0.22   (  1.43 ‥ 3.81)   │ cuCtxSynchronize        │
│   16.03% │     1.6 ms │   525 │   3.05 µs ± 8.31   (  2.15 ‥ 192.88) │ cuMemcpyHtoDAsync       │
│   12.53% │    1.25 ms │  1050 │   1.19 µs ± 0.44   (  0.72 ‥ 8.34)   │ cuMemAllocFromPoolAsync │
│    1.58% │  158.07 µs │   525 │ 301.09 ns ± 119.54 (238.42 ‥ 715.26) │ cuStreamSynchronize     │
└──────────┴────────────┴───────┴──────────────────────────────────────┴─────────────────────────┘

Not as nice as BenchmarkTools' histograms, but we can't fit those in a table.
This is as much information as I could neatly pack into a single cell.
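
For readers of the table above: each "Time distribution" cell appears to read as mean ± standard deviation, followed by the (minimum ‥ maximum) range, all in the unit shown after the mean. The following is a minimal sketch of how such a cell could be built from a vector of per-call durations. `summarize_times` is a hypothetical helper written for illustration, not the code in `src/profile.jl`, and the unit selection rule is an assumption.

```julia
using Statistics  # standard library: mean, std
using Printf      # standard library: @sprintf

# Hypothetical helper (not CUDA.jl's implementation): summarize per-call
# durations, given in seconds, as "mean unit ± std (min ‥ max)".
function summarize_times(times::AbstractVector{<:Real})
    m = mean(times)
    # Assumption: choose one unit based on the mean, so that all four
    # numbers in the cell share the same scale.
    scale, unit = m >= 1    ? (1.0, "s")  :
                  m >= 1e-3 ? (1e3, "ms") :
                  m >= 1e-6 ? (1e6, "µs") : (1e9, "ns")
    # The sample standard deviation is undefined for a single call.
    s = length(times) > 1 ? std(times) : 0.0
    lo, hi = extrema(times)
    return @sprintf("%.2f %s ± %.2f (%.2f ‥ %.2f)",
                    m * scale, unit, s * scale, lo * scale, hi * scale)
end

println(summarize_times([3.1e-6, 3.4e-6, 3.5e-6, 4.0e-6]))
```

Sharing one unit across the cell keeps it compact, at the cost of small or large outliers (such as the 192.88 µs maximum for cuMemcpyHtoDAsync) being printed in a unit chosen for the mean.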

@bprofile is pretty naive as I didn't want to depend on BenchmarkTools for better logic, but it's a good start.
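
To illustrate what a naive approach could look like, here is a minimal sketch of a time-budgeted loop that simply re-runs a function until a wall-clock budget is used up. `run_for` is a hypothetical name and this is not necessarily how `@bprofile` is implemented; it only shows the kind of logic that skips what BenchmarkTools adds on top, such as warm-up and tuning the number of evaluations per sample.

```julia
# Hypothetical sketch (an assumption, not the code from this PR): call `f`
# repeatedly until `budget` seconds have elapsed, recording each duration.
function run_for(f, budget::Real)
    samples = Float64[]
    deadline = time() + budget
    while true
        t0 = time_ns()
        f()
        push!(samples, (time_ns() - t0) / 1e9)  # seconds
        # Check the deadline after the call, so `f` runs at least once.
        time() >= deadline && break
    end
    return samples
end

samples = run_for(() -> sum(rand(100)), 0.01)
println("collected ", length(samples), " samples")
```

With a budget of 0.01 s, cheap operations get executed hundreds of times, which is consistent with the 525 calls per API function in the trace above.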

@maleadt added the "enhancement" (New feature or request) and "cuda kernels" (Stuff about writing CUDA kernels.) labels Nov 9, 2023

codecov bot commented Nov 10, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (853baef) 72.37% compared to head (2e3f433) 72.63%.
Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2162      +/-   ##
==========================================
+ Coverage   72.37%   72.63%   +0.26%     
==========================================
  Files         159      159              
  Lines       14535    14592      +57     
==========================================
+ Hits        10519    10599      +80     
+ Misses       4016     3993      -23     
Files Coverage Δ
src/profile.jl 83.43% <92.30%> (+7.48%) ⬆️

... and 1 file with indirect coverage changes


@maleadt maleadt merged commit 487f725 into master Nov 10, 2023
1 check passed
@maleadt maleadt deleted the tb/profile branch November 10, 2023 11:33