Is your feature request related to a problem? Please describe.
I am working with long sequences and frequently run into GPU memory limits when processing them. This is particularly challenging with models that use full attention, whose memory cost scales quadratically with sequence length.
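To make the quadratic scaling concrete, here is a back-of-the-envelope sketch (illustrative only; batch size and head count fixed at 1, fp32 scores) of how large a single full attention score matrix gets as the sequence grows:

```python
def attn_matrix_bytes(seq_len: int, dtype_bytes: int = 4) -> int:
    # One (seq_len x seq_len) attention score matrix in the given
    # precision: memory grows as seq_len**2.
    return seq_len * seq_len * dtype_bytes

for n in (1_024, 4_096, 16_384):
    gib = attn_matrix_bytes(n) / 2**30
    print(f"seq_len={n:>6}: {gib:.2f} GiB per head")
# At 16,384 tokens a single head's score matrix is already 1 GiB.
```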
Describe the solution you'd like
I would like to request the development of a pre-trained model that incorporates sparse or BlockSparse attention mechanisms. These mechanisms are designed to handle long sequences more efficiently, without consuming excessive GPU memory. The model should be able to process long input sequences while maintaining performance comparable to existing models that use full attention.
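As a toy illustration of why a block-sparse pattern helps (this is not the actual BigBird/BlockSparse kernel, just a sketch of a block-diagonal mask), each query block attends only to its own key block, so the number of attended entries grows linearly with sequence length instead of quadratically:

```python
import numpy as np

def block_local_mask(seq_len: int, block: int) -> np.ndarray:
    # True where attention is allowed: each query attends only to
    # keys in the same block. Attended entries: seq_len * block,
    # versus seq_len**2 for full attention.
    blk = np.arange(seq_len) // block
    return blk[:, None] == blk[None, :]

mask = block_local_mask(256, 64)
print(int(mask.sum()), "of", mask.size, "entries attended")
```

Real block-sparse schemes add global and random blocks on top of this local pattern to recover long-range connectivity, but the memory argument is the same.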
Describe alternatives you've considered
Truncating or splitting long sequences: This approach can be used to fit sequences within memory constraints, but it may result in loss of contextual information and reduced model performance.
Sliding window approach: This method involves processing long sequences in smaller, overlapping segments. However, it still does not fully address the problem of capturing long-range dependencies in the data.
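For reference, the sliding-window workaround I have tried looks roughly like this (a minimal sketch; `window` and `stride` values are arbitrary). It keeps local context via the overlap, but any dependency longer than one window is still lost:

```python
def sliding_windows(tokens: list, window: int = 512, stride: int = 384) -> list:
    # Split a long sequence into overlapping segments: consecutive
    # windows share (window - stride) tokens, so local context is
    # preserved, but dependencies longer than `window` are cut.
    segments = []
    start = 0
    while start < len(tokens):
        segments.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
        start += stride
    return segments
```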
Additional context
I am mainly wondering if such a model will be pushed to Hugging Face's model repository, which would make it more accessible and easier to use for the broader community.