
Speed regression #245

Open · g5t opened this issue Oct 22, 2024 · 4 comments

Comments

@g5t (Contributor) commented Oct 22, 2024

While working on essspectroscopy I noticed a dramatic runtime increase in the bifrost workflow after upgrading a dependent package, during which uv also upgraded scippnexus from 24.8.0 to 24.10.0.

I can run part of the workflow under cProfile to work out where the extra time is being spent, e.g.:

import cProfile
import pstats
from pstats import SortKey

# bifrost, POOCH_DATA_DIR and datafile come from the workflow setup (not shown here).
# Profile the call and write the statistics to the file 'cstats'.
cProfile.run('data = bifrost(POOCH_DATA_DIR / datafile, is_simulated=True)', 'cstats')
p = pstats.Stats('cstats')

# Show the ten entries with the largest cumulative time.
p.sort_stats(SortKey.CUMULATIVE).print_stats(10)
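
As an aside, print_stats also accepts a regular-expression restriction, so the same listing can be narrowed to scippnexus frames before applying the count limit:

# Keep only frames whose path matches 'scippnexus', and at most 10 of them.
p.sort_stats(SortKey.CUMULATIVE).print_stats('scippnexus', 10)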

The change occurs with the latest release, and persists at the current reference on main.

scippnexus   time per setting
24.8.0        1.76 s
24.8.1        1.80 s
24.9.0        1.82 s
24.9.1        1.82 s
24.10.0      11.22 s

An example output of the profile recorded as above for 24.10.0 is

Tue Oct 22 22:41:43 2024    cstats

         46916467 function calls (46646727 primitive calls) in 39.160 seconds

   Ordered by: cumulative time
   List reduced from 1198 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25748    0.019    0.000   26.411    0.001 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib/python3.12/site-packages/scippnexus/base.py:251(_children)
    19658    0.939    0.000   26.390    0.001 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib/python3.12/site-packages/scippnexus/base.py:258(_read_children)
  1152334    0.461    0.000   23.813    0.000 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib64/python3.12/site-packages/h5py/_hl/base.py:431(__iter__)
  1132676    0.719    0.000   21.838    0.000 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib64/python3.12/site-packages/h5py/_hl/group.py:372(get)
  1133577    8.618    0.000   20.983    0.000 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib64/python3.12/site-packages/h5py/_hl/group.py:348(__getitem__)
  1137887    1.140    0.000    4.550    0.000 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib64/python3.12/site-packages/h5py/_hl/group.py:34(__init__)
  2439862    2.945    0.000    3.431    0.000 <frozen importlib._bootstrap>:645(parent)
43779/29563    0.048    0.000    3.100    0.000 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib/python3.12/site-packages/scippnexus/_cache.py:32(__get__)
  1180946    2.832    0.000    2.836    0.000 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib64/python3.12/site-packages/h5py/_hl/base.py:336(__init__)
       43    0.000    0.000    2.715    0.063 /home/g/Projects/20240914/bifrost-workflow-2409/.venv/lib/python3.12/site-packages/scippnexus/file.py:60(__exit__)

Given that the changes in 24.10.0 appear aimed at improving transformation-chain loading, is this an indication of a problem with the file I'm loading, or with scippnexus?

@SimonHeybrock (Member) commented Oct 23, 2024

Yes, this is somewhat expected, and it is related to what I asked you about recently (see the private chat from September 27th). Having many components and transformation chains with multiple steps is the likely cause.

I will look again into adding a cache.
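
For illustration only (this would not be the actual scippnexus implementation, and the names below are made up), that kind of cache amounts to memoizing the expensive per-group chain resolution, e.g. with functools.lru_cache:

from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def resolve_chain(group_path: str) -> str:
    # Stand-in for the expensive traversal of depends_on links seen in the
    # profile above; here it only counts how often the work actually runs.
    global calls
    calls += 1
    return f'resolved {group_path}'

resolve_chain('/entry/instrument/detector')  # does the work
resolve_chain('/entry/instrument/detector')  # served from the cache
print(calls)  # 1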

Can you share the file you used for this?

@g5t (Contributor, Author) commented Oct 23, 2024

The workflows for BIFROST are highly irregular at the moment, so I wouldn't worry about adding a cache just yet.
Instead, I imagine it should be possible to limit how often bifrost opens the file (if your earlier estimate of a 5-second penalty was right, it apparently opens the same file twice per setting).
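
Roughly what I have in mind, as a sketch only: open the file once and hand the open group around; load_setting and settings are hypothetical stand-ins for the relevant bifrost pieces.

import scippnexus as snx

# Open the file once and pass the open entry group to every setting,
# instead of letting each setting reopen and re-traverse the file.
with snx.File(POOCH_DATA_DIR / datafile) as f:
    entry = f['entry']
    results = [load_setting(entry, setting) for setting in settings]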

@SimonHeybrock (Member) commented

Yes. And we should probably first refactor ESSspectroscopy to use GenericNeXusWorkflow before making any decisions here.

@jl-wynen (Member) commented

This is in progress on this branch: https://github.com/scipp/essspectroscopy/tree/nexus-workflow. Feel free to contribute if you don't want to wait until I have time to continue the work.
