some thoughts about "chunksize" in iter_parallel_chains function of beat/sampler/base.py #86

Closed
ranneylxr opened this issue Aug 15, 2021 · 3 comments


@ranneylxr

Hi again,
In the iter_parallel_chains function in beat/sampler/base.py (lines 476-482):

        if chunksize is None:
            if draws < 10:
                chunksize = int(np.ceil(float(n_chains) / n_jobs))
            elif draws > 10 and tps < 0.5:
                chunksize = int(np.ceil(float(n_chains) / n_jobs))
            else:
                chunksize = n_jobs

The value of tps seems to depend on the hardware (I have installed libamdm). If we set a larger n_jobs, the chunksize also becomes larger in the case where tps > 0.5, draws > 10, and stage > 0.

According to https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map, a larger chunksize leads to fewer chunks. Once the number of chunks drops below n_jobs, some workers have nothing to do, so increasing n_jobs further actually reduces the number of chunks running in parallel and the calculation takes longer.
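To make the concern concrete, here is a minimal sketch of the arithmetic. It assumes that Pool.map splits the chains into roughly ceil(n_chains / chunksize) tasks and that each task occupies one worker; `chunk_stats` is a hypothetical helper for illustration and not part of BEAT.

```python
import math


def chunk_stats(n_chains, n_jobs, chunksize):
    """Return (number of chunks, number of workers that get any work).

    Assumption: the chains are split into ceil(n_chains / chunksize)
    tasks, and a worker processes one task at a time.
    """
    n_chunks = math.ceil(n_chains / chunksize)
    busy_workers = min(n_jobs, n_chunks)
    return n_chunks, busy_workers


n_chains = 100
for n_jobs in (4, 20, 50):
    chunksize = n_jobs  # what the `else` branch of the snippet above does
    n_chunks, busy = chunk_stats(n_chains, n_jobs, chunksize)
    print(f"n_jobs={n_jobs:2d} chunksize={chunksize:2d} "
          f"chunks={n_chunks:2d} busy_workers={busy:2d}")
```

Under these assumptions, with 100 chains the `else` branch keeps all 4 workers busy for n_jobs=4, but only 5 of 20 workers for n_jobs=20 and 2 of 50 for n_jobs=50.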

Is this correct? And can I set an arbitrary chunksize manually in a script?
Thank you!

@hvasbath
Owner

Hi again,

Cool that you are still around ;) .
You are right. The intention is this: if your forward model takes a long time, you want a small chunksize, i.e. the work is distributed in smaller chunks to more workers. Otherwise it often happens that a single worker is left with a big chunk of work, and all the other workers have to wait for it to finish before the next stage can start.
Vice versa, if you have a fast forward model you want a big chunksize, because initialising the worker then takes longer than the sampling itself.
Is that understandable? I could not completely understand what your problem with that setup is, though. For now you cannot define chunksize in the config file, but if it would help you, we can surely add that; it is not a big deal.
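The trade-off described above can be sketched with a small, purely illustrative simulation. It is not BEAT code: `simulate_makespan` is a hypothetical helper, the per-chunk `chunk_overhead` is an assumed stand-in for the worker start-up cost, and chunks are assumed to go to whichever worker becomes free first.

```python
import heapq


def simulate_makespan(task_costs, n_workers, chunksize, chunk_overhead=0.0):
    """Wall time until all tasks are done.

    Tasks are grouped into consecutive chunks of `chunksize`; each chunk
    is handed to the worker that becomes free first and costs
    `chunk_overhead` plus the sum of its task costs.
    """
    chunks = [task_costs[i:i + chunksize]
              for i in range(0, len(task_costs), chunksize)]
    free_at = [0.0] * n_workers  # time at which each worker is free again
    heapq.heapify(free_at)
    for chunk in chunks:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + chunk_overhead + sum(chunk))
    return max(free_at)


# Slow forward model: 100 chains of 1 s each on 20 workers, no overhead.
slow = [1.0] * 100
print(simulate_makespan(slow, 20, chunksize=20))  # 5 big chunks, 15 idle workers
print(simulate_makespan(slow, 20, chunksize=5))   # 20 chunks, every worker busy

# Fast forward model: 1 ms per chain, assumed 50 ms start-up cost per chunk.
fast = [0.001] * 100
print(simulate_makespan(fast, 20, chunksize=1, chunk_overhead=0.05))
print(simulate_makespan(fast, 20, chunksize=5, chunk_overhead=0.05))
```

In this toy model the slow case finishes four times sooner with the smaller chunksize, while the fast case finishes sooner with the larger one because the assumed start-up cost is paid fewer times per worker.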

Cheers!

@ranneylxr
Author

I understand now!
Thank you for explaining.

Best regards.

@hvasbath
Owner

Sorry for the late fix. Apparently I did not get the point correctly until I tried it myself with a larger number of chains.
It is fixed in the current dev branch here: #121 and should be released to master soon.
