attention-heads as samples from posterior distribution in a Bayesian sense #22

Open
sgbaird opened this issue Feb 5, 2022 · 0 comments


sgbaird commented Feb 5, 2022

https://aclanthology.org/2020.emnlp-main.17.pdf

Though I think CrabNet might need to be refitted to draw new samples (i.e., if you specify N=10, you only get 10 samples from the posterior; getting more would probably require refitting, and I'm not sure those would be directly comparable to the 10 from the first run). I'm also not exactly sure how the per-head outputs could be converted into individual predictions. Maybe just some basic plumbing in and after the following (a rough sketch follows the snippet):

CrabNet/crabnet/kingcrab.py

Lines 151 to 157 in 9e0d79c

        if self.attention:
            encoder_layer = nn.TransformerEncoderLayer(self.d_model,
                                                       nhead=self.heads,
                                                       dim_feedforward=2048,
                                                       dropout=0.1)
            self.transformer_encoder = nn.TransformerEncoder(encoder_layer,
                                                             num_layers=self.N)
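
For reference, here's a rough, hypothetical sketch of the kind of plumbing I mean: build the encoder the same way as above and pull the unaveraged (per-head) attention weights out of each layer by calling its `self_attn` module directly. The names `d_model`, `heads`, and `N` mirror the snippet, but the values and the loop are made up for illustration (this is not CrabNet's actual API), and `average_attn_weights=False` requires PyTorch ≥ 1.11. This only exposes the per-head attention maps; turning them into individual predictions would still take more work.

```python
import torch
import torch.nn as nn

# Hypothetical example values; CrabNet's actual defaults may differ.
d_model, heads, N = 512, 4, 3

encoder_layer = nn.TransformerEncoderLayer(d_model,
                                           nhead=heads,
                                           dim_feedforward=2048,
                                           dropout=0.1)
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=N)
transformer_encoder.eval()  # disable dropout so the attention maps are deterministic

x = torch.randn(10, 2, d_model)  # (seq_len, batch, d_model); batch_first=False by default

per_head_weights = []
out = x
with torch.no_grad():
    for layer in transformer_encoder.layers:
        # Re-run just the self-attention with unaveraged weights:
        # shape (batch, heads, seq_len, seq_len), i.e. one map per head.
        _, attn = layer.self_attn(out, out, out,
                                  need_weights=True,
                                  average_attn_weights=False)
        per_head_weights.append(attn)
        out = layer(out)  # normal forward pass through the layer

# Each head's attention map could be treated as one posterior "sample";
# the spread across heads/layers might serve as a crude uncertainty signal.
print(per_head_weights[0].shape)  # e.g. torch.Size([2, 4, 10, 10])
```

Re-calling `layer.self_attn` duplicates the attention computation, but it avoids modifying PyTorch's `TransformerEncoderLayer`; a forward hook would be the less redundant alternative.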
