1. Feature description
Make ScaledDotProductAttention in MultiHeadAttention replaceable by other attention operators.
PyPOTS/pypots/nn/modules/transformer/attention.py, line 138 (commit b5d5d1c)
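A minimal sketch of what the more flexible layer could look like, assuming the operator only needs to expose a `forward(q, k, v, attn_mask) -> (output, attn_weights)` interface and gets injected through the constructor. The class and argument names below (e.g. `attn_opt`) are illustrative and do not reproduce the exact PyPOTS signatures:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScaledDotProductAttention(nn.Module):
    """Vanilla scaled dot-product attention, used as the default operator."""

    def __init__(self, temperature: float, attn_dropout: float = 0.1):
        super().__init__()
        self.temperature = temperature
        self.dropout = nn.Dropout(attn_dropout)

    def forward(self, q, k, v, attn_mask=None):
        # q, k, v: [batch, n_heads, seq_len, d_k or d_v]
        scores = torch.matmul(q, k.transpose(-2, -1)) / self.temperature
        if attn_mask is not None:
            scores = scores.masked_fill(attn_mask == 0, -1e9)
        attn = self.dropout(F.softmax(scores, dim=-1))
        return torch.matmul(attn, v), attn


class MultiHeadAttention(nn.Module):
    """Multi-head attention that accepts any attention operator sharing the
    forward(q, k, v, attn_mask) -> (output, attn) interface (sketch only)."""

    def __init__(self, n_heads, d_model, d_k, d_v, attn_opt: nn.Module = None):
        super().__init__()
        self.n_heads, self.d_k, self.d_v = n_heads, d_k, d_v
        self.w_qs = nn.Linear(d_model, n_heads * d_k, bias=False)
        self.w_ks = nn.Linear(d_model, n_heads * d_k, bias=False)
        self.w_vs = nn.Linear(d_model, n_heads * d_v, bias=False)
        self.fc = nn.Linear(n_heads * d_v, d_model, bias=False)
        # the injected operator replaces the hard-coded ScaledDotProductAttention
        self.attention = attn_opt or ScaledDotProductAttention(d_k**0.5)

    def forward(self, q, k, v, attn_mask=None):
        # assumes q, k, v share the same sequence length (self-attention case)
        batch_size, seq_len = q.size(0), q.size(1)
        q = self.w_qs(q).view(batch_size, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        k = self.w_ks(k).view(batch_size, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        v = self.w_vs(v).view(batch_size, seq_len, self.n_heads, self.d_v).transpose(1, 2)
        out, attn = self.attention(q, k, v, attn_mask=attn_mask)
        out = out.transpose(1, 2).contiguous().view(batch_size, seq_len, -1)
        return self.fc(out), attn
```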
2. Motivation
Nowadays the Transformer is applied everywhere, and there are many variants of the originally proposed self-attention. The overall structure usually stays the same (the multi-layer stacked encoder, the multi-head attention layer, the feed-forward layer, etc.), while the attention operator often changes, e.g. ProbAttention from Informer. By making the Transformer layers more flexible, we can construct more complex models from the existing modules in PyPOTS, as the usage sketch below shows.
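For instance, any operator that matches the same forward interface could then be dropped in without touching the rest of the layer. The `IdentityAttention` below is a hypothetical stand-in (not a real PyPOTS or Informer module), used only to show the wiring against the sketch above:

```python
import torch
import torch.nn as nn


class IdentityAttention(nn.Module):
    """Hypothetical operator that simply averages the values; it only needs
    to match the forward(q, k, v, attn_mask) -> (output, attn) interface."""

    def forward(self, q, k, v, attn_mask=None):
        # uniform attention weights over the key dimension
        attn = torch.full(
            (q.size(0), q.size(1), q.size(2), k.size(2)),
            1.0 / k.size(2),
            device=q.device,
        )
        return torch.matmul(attn, v), attn


# plug the custom operator into the sketched MultiHeadAttention layer
layer = MultiHeadAttention(n_heads=4, d_model=64, d_k=16, d_v=16,
                           attn_opt=IdentityAttention())
x = torch.randn(8, 24, 64)          # [batch, seq_len, d_model]
out, attn_weights = layer(x, x, x)  # out: [8, 24, 64]
```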
3. Your contribution
Will make a PR to achieve this goal.