Fix for joblib not saturating CPU during multiprocessing #1188
Conversation
…ficient_simulator
…mulator. The wrapping/casting currently increases the runtime roughly three times, but the code cannot be breaking for now.
…ovic/sbi into 1175-simulation-multiproc-fix
…ests (`pytest -n auto -m "not slow and not gpu"`)
Thanks for the thorough PR!
I made some comments to refactor `simulate_for_sbi`. At the moment, the function is a bit cluttered with several if-else cases and I think we can get rid of some of them.

Otherwise, I suggest that you use the files in `benchmarks/` just for testing as part of this PR and remove them before merging. Alternatively, you could try to condense them into one concise test and add it to `tests/multiprocessing_test.py` or so.

One question: is there a specific reason why you add the `process_simulator` etc. to many of the test cases?
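A condensed test along these lines might look as follows (a hedged sketch for a hypothetical `tests/multiprocessing_test.py`; the toy simulator and the direct joblib call are illustrative, not sbi's internal code):

```python
import numpy as np
from joblib import Parallel, delayed


def simulator(theta: np.ndarray) -> np.ndarray:
    # Deterministic toy simulator so serial and parallel runs are comparable.
    return theta ** 2 + 1.0


def test_parallel_matches_serial():
    theta = np.linspace(-1.0, 1.0, 20).reshape(10, 2)
    serial = np.stack([simulator(t) for t in theta])
    parallel = np.stack(
        Parallel(n_jobs=2)(delayed(simulator)(t) for t in theta)
    )
    # Parallel dispatch must not change the simulation results.
    assert np.allclose(serial, parallel)
```

A single test of this shape checks correctness of the parallel path without keeping the whole `benchmarks/` folder around.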
Codecov Report. Attention: Patch coverage is …
Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1188      +/-   ##
==========================================
- Coverage   84.55%   75.56%    -9.00%
==========================================
  Files          96       96
  Lines        7603     7629      +26
==========================================
- Hits         6429     5765     -664
- Misses       1174     1864     +690
```

Flags with carried forward coverage won't be shown.
…in `simulate_for_sbi`, improved comments, removed `benchmark` folder
Absolutely, now the …

where

```python
theta = proposal.sample((num_simulations,)).numpy()

def joblib_simulator(theta: ndarray) -> Tensor:
    return torch.as_tensor(simulator(torch.as_tensor(theta)), dtype=float32)
```

Wrapping the simulator with … As stated in …
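The wrapping being discussed can be sketched end-to-end with a toy simulator (illustrative names only; a plain Normal distribution stands in for `proposal`, and the real sbi code differs):

```python
import numpy as np
import torch


def simulator(theta: torch.Tensor) -> torch.Tensor:
    # Toy stand-in for a real simulator: add Gaussian noise to theta.
    return theta + torch.randn_like(theta)


def joblib_simulator(theta: np.ndarray) -> torch.Tensor:
    # Each call pays two conversions: numpy -> torch on the way in and a
    # dtype-normalizing torch.as_tensor on the way out. This per-call
    # wrapping is what the thread reports as roughly tripling the runtime.
    return torch.as_tensor(simulator(torch.as_tensor(theta)), dtype=torch.float32)


# Stand-in for proposal.sample((num_simulations,)).numpy()
theta = torch.distributions.Normal(0.0, 1.0).sample((10, 3)).numpy()
x = joblib_simulator(theta)
```

Avoiding these per-call conversions is the further refactoring mentioned later in the thread.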
Thanks for the detailed explanation. Overall, this makes sense. But I believe that in the test cases, we do not really have to use `simulate_for_sbi` because we mostly use uniform or Gaussian priors and vectorized Gaussian simulators. Thus, we can just call them directly:

```python
theta = prior.sample((num_simulations,))
x = simulator(theta)
```

I know that in some test functions this is not done. But in my view, `simulate_for_sbi` is really a convenience function for the users that we do not need to use internally. We also do not really need multiprocessing during testing because the toy simulators are basically instant in speed. Or am I missing something here?

Again, please excuse if my previous comments were not clear and have caused extra work for you.
…e_on_device_test.py sbi-dev#1188 Co-authored-by: Jan <[email protected]>
…ovic/sbi into 1175-simulation-multiproc-fix
No problem at all, thank you for following up on the matter! So, I was refactoring the tests when I realized that `simulate_for_sbi` allows to control the seed in a very handy manner. Without that, some of the tests fail due to unsatisfactory performance (e.g. `test_c2st_snl_on_linear_gaussian_different_dims`).

Given that the current tests' implementation with `simulate_for_sbi` does not really have strong downsides, should I still proceed with the refactoring?
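For what it's worth, the kind of seed control that makes test runs reproducible can be approximated outside of sbi like this (a minimal sketch; `seed_all` and `run_simulations` are illustrative helpers, not sbi's internals):

```python
import random

import numpy as np
import torch


def seed_all(seed: int) -> None:
    # Seed every RNG a simulator might touch so that repeated
    # runs produce identical training data.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def run_simulations(seed: int, num_simulations: int = 5):
    seed_all(seed)
    theta = torch.randn(num_simulations, 2)
    x = theta + torch.randn_like(theta)
    return theta, x


theta_a, x_a = run_simulations(seed=0)
theta_b, x_b = run_simulations(seed=0)
```

With a fixed seed, both runs yield identical `(theta, x)` pairs, which keeps flaky performance-based tests stable.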
Yes, makes sense to keep it then. Thanks!
Looks great! You only need to run `pre-commit run --all-files` to get rid of unused imports, I think. Then it will probably be good to go! 🎉
Looks good now. Thanks a lot for the effort that went into analyzing and fixing this! Great work 👏
Will merge once the testing PR is merged and all tests are passing.
Thanks to you for the assistance and the great feedback! :)
Actually, one thing that I just realized: …

This implementation effectively bypasses …

OK, I understand. Thus, for this PR we will keep it. But in a future PR, we might well just "refactor-away" …

Actually, I think I didn't get what you meant. On this branch, …

Yes, but there are other methods still using it, e.g., in …
What does this implement/fix? Explain your changes
This PR refactors the `simulate_for_sbi` method so that now: `simulate_in_batches` … `simulate_for_sbi` not saturating CPU #1175 …
Does this close any currently open issues?
Fixes #1175
Any relevant code examples, logs, error output, etc?
Refer to the thread in #1175
...
Any other comments?
Further refactoring will be needed in order to avoid the `process_simulator` wrapping (possibly reducing the simulation runtime by roughly another factor of three)…
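For context, the chunked dispatch pattern that lets joblib keep all workers busy can be sketched like this (illustrative only; `simulate_in_chunks` is a hypothetical helper, not the code in this PR):

```python
import numpy as np
from joblib import Parallel, delayed


def simulator(theta: np.ndarray) -> np.ndarray:
    # Vectorized toy simulator operating on a whole batch at once.
    return np.sin(theta) + theta ** 2


def simulate_in_chunks(theta: np.ndarray, num_workers: int) -> np.ndarray:
    # Splitting the parameter matrix into one chunk per worker keeps
    # every process busy instead of dispatching rows one at a time.
    chunks = np.array_split(theta, num_workers)
    results = Parallel(n_jobs=num_workers)(
        delayed(simulator)(chunk) for chunk in chunks
    )
    return np.concatenate(results)


theta = np.random.default_rng(0).normal(size=(100, 3))
x = simulate_in_chunks(theta, num_workers=2)
```

Dispatching sizable chunks amortizes joblib's per-task overhead, which is what allows the CPU to stay saturated.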
Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

- [ ] … guidelines …
- [ ] … with `pytest.mark.slow`.
- [ ] … guidelines …
- [ ] … `main` (or there are no conflicts with `main`)