Enhancement of multiBamSummary performance #1138
Comments
I've just finished reading about Dask. At first, I thought one of these options could speed up the step of writing data to disk; according to this benchmark, I was wrong. The need to replace NumPy is probably related to other functions (matrix operations, algebra, etc.) that should be parallelized. Of the options evaluated, Dask seems promising, but it would entail rewriting more modules, because some of the NumPy functionality currently used in deepTools is out of its scope.
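As a minimal sketch of the idea above: Dask can wrap an existing NumPy array in chunks so that reductions over it run in parallel, with little change to the surrounding code. The coverage matrix below is synthetic and the shapes are hypothetical; this is only an illustration of the Dask array API, not the actual multiBamSummary code path.

```python
import numpy as np
import dask.array as da

# Hypothetical coverage matrix: rows are genomic bins, columns are BAM files.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=10, size=(100_000, 8)).astype(np.float64)

# Wrap the NumPy array in a chunked Dask array; each chunk of bins can be
# reduced on a separate worker.
dcounts = da.from_array(counts, chunks=(10_000, 8))

# Per-file mean coverage, built lazily and materialized with .compute().
per_file_mean = dcounts.mean(axis=0).compute()
```

The appeal is that `da.from_array` plus `.compute()` is nearly a drop-in change for plain NumPy reductions; the harder part, as noted above, is the NumPy functionality Dask does not cover.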
If larger data files are needed for the tests, this can be sorted out with git-lfs.
It is slow when the number of BAM files increases. (It could potentially be affected by high sequencing depth as well.)