Create random numbers in bigger chunks #72

Merged
denisalevi merged 1 commit into master from remove_rng_overhead on Mar 22, 2017

Conversation

denisalevi
Member

Previously, random numbers were generated every clock cycle, which created
significant overhead. This implementation still needs one
`cudaMemcpyToSymbol` call per clock cycle for each codeobject that uses
rand/randn, which could be avoided.

This implementation generically generates at most 50MB of random numbers per codeobject and regenerates them once they are used up. This way, small simulations generate their random numbers only once, while bigger simulations (where memory is a potential limit) do not generate too many numbers at a time.
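
The buffering scheme described above can be sketched in a few lines. This is a host-side Python illustration, not the code from this PR: the class name `ChunkedRandomBuffer`, its parameters, the 8 bytes per number (double precision), and reading 50MB as 50 * 1024 * 1024 bytes are all assumptions made for the example, and Python's `random` module stands in for the GPU random number generator.

```python
import random

# Cap per codeobject from the description; interpreting "50MB" as MiB is an assumption.
MAX_BUFFER_BYTES = 50 * 1024 * 1024
# Assumes double-precision numbers; hypothetical, not taken from the PR.
BYTES_PER_NUMBER = 8


class ChunkedRandomBuffer:
    """Hypothetical helper: pre-generates random numbers for many clock
    cycles at once and regenerates only when the buffer is used up."""

    def __init__(self, numbers_per_cycle, total_cycles,
                 max_bytes=MAX_BUFFER_BYTES, seed=None):
        if numbers_per_cycle <= 0 or total_cycles <= 0:
            raise ValueError("numbers_per_cycle and total_cycles must be positive")
        max_numbers = max_bytes // BYTES_PER_NUMBER
        # Buffer whole cycles only: never more than the simulation needs,
        # and at least one cycle (which may exceed the cap for huge cycles).
        self.cycles_per_chunk = max(
            1, min(total_cycles, max_numbers // numbers_per_cycle))
        self.numbers_per_cycle = numbers_per_cycle
        self._rng = random.Random(seed)
        self._buffer = []
        # Mark the buffer as exhausted so the first call generates it.
        self._cycle_in_chunk = self.cycles_per_chunk
        self.regenerations = 0

    def _regenerate(self):
        count = self.cycles_per_chunk * self.numbers_per_cycle
        self._buffer = [self._rng.random() for _ in range(count)]
        self._cycle_in_chunk = 0
        self.regenerations += 1

    def next_cycle(self):
        """Return the random numbers for the next clock cycle."""
        if self._cycle_in_chunk >= self.cycles_per_chunk:
            self._regenerate()
        start = self._cycle_in_chunk * self.numbers_per_cycle
        self._cycle_in_chunk += 1
        return self._buffer[start:start + self.numbers_per_cycle]


# Small simulation: everything fits, so the buffer is filled only once.
small = ChunkedRandomBuffer(numbers_per_cycle=4, total_cycles=10, seed=0)
for _ in range(10):
    small.next_cycle()
print(small.cycles_per_chunk, small.regenerations)
```

With the default cap, all 10 cycles fit into one chunk, so `regenerations` stays at 1. Passing a small `max_bytes` forces several refills, which corresponds to the bigger-simulation case. The per-cycle offset `start` is presumably what still has to be communicated to the device each cycle in the real implementation, but that mapping is a guess on the sketch's part.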

This is a quick and dirty solution that might fail if we run into memory limits. But for our current benchmarks it's good enough, and since we should probably use a cleaner buffer system at some point, I will leave it like this for now.

@denisalevi denisalevi merged commit 7dd0dec into master Mar 22, 2017
@denisalevi denisalevi deleted the remove_rng_overhead branch March 22, 2017 18:23