Batch with different rna length #15

Open
giorgiobini opened this issue Sep 20, 2022 · 2 comments

Comments

@giorgiobini

Hello,

I am wondering whether you have a function to pad batches that contain RNA sequences of different lengths.

Thank you so much in advance!

@sperfu
Contributor

sperfu commented Sep 21, 2022

Hi there,

Sorry about that. Because our framework handles sequences of varying length, we fixed the batch size to avoid out-of-memory issues: the model is trained with a batch size of 1 on all the data. So we currently do not provide a function to pad batches of different-length sequences.

Thanks.

@sperfu
Contributor

sperfu commented Sep 21, 2022

Hi,
Regarding your question about padding batches of different sizes, I'm afraid we don't have such a function. The reason is that sequences differ widely in length (ranging from 10 bp to over a thousand bp). Padding them all to the same length would inevitably introduce uninformative positions, which would degrade performance. So we use a batch size of 1, with one sequence per input, to avoid padding.

Thanks.
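
For anyone who still wants to batch variable-length sequences on their own, here is a minimal sketch of padding with a mask. It is not part of this repository: `PAD_ID`, `VOCAB`, `pad_batch`, and `masked_mean` are hypothetical names chosen for illustration, and the integer encoding is an assumption. The mask marks real positions so that downstream pooling can ignore the padded ones, which is one common way to limit the "useless information" concern described above. Whether it preserves this model's performance is untested.

```python
from typing import List, Sequence, Tuple

PAD_ID = 0  # hypothetical padding index, kept distinct from every real base
VOCAB = {"A": 1, "C": 2, "G": 3, "U": 4}  # hypothetical nucleotide encoding


def pad_batch(seqs: Sequence[str]) -> Tuple[List[List[int]], List[List[int]]]:
    """Encode RNA strings and right-pad them to the longest one in the batch.

    Returns (tokens, mask), where mask is 1 at real positions and 0 at padding.
    """
    if not seqs:
        raise ValueError("pad_batch needs at least one sequence")
    encoded = [[VOCAB[base] for base in s.upper()] for s in seqs]
    max_len = max(len(e) for e in encoded)
    tokens = [e + [PAD_ID] * (max_len - len(e)) for e in encoded]
    mask = [[1] * len(e) + [0] * (max_len - len(e)) for e in encoded]
    return tokens, mask


def masked_mean(values: Sequence[float], mask: Sequence[int]) -> float:
    """Average only the positions where mask is 1, so padding has no effect."""
    kept = sum(mask)
    if kept == 0:
        raise ValueError("mask selects no positions")
    return sum(v * m for v, m in zip(values, mask)) / kept


tokens, mask = pad_batch(["ACGU", "GC"])
print(tokens)  # [[1, 2, 3, 4], [3, 2, 0, 0]]
print(mask)    # [[1, 1, 1, 1], [1, 1, 0, 0]]
```

Padding to the longest sequence in each batch, rather than to a global maximum, keeps memory use lower when short and long sequences are grouped separately.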
