attention-heads as samples from posterior distribution in a Bayesian sense #22

Open
sgbaird opened this issue Feb 5, 2022 · 0 comments


sgbaird commented Feb 5, 2022

https://aclanthology.org/2020.emnlp-main.17.pdf

Though I think CrabNet might need to be refitted to draw new samples (i.e., if you specify N=10, you only get 10 samples from the posterior; getting more would probably require refitting, and I'm not sure those would be directly comparable to the 10 from the first run). I'm also not exactly sure how the per-head outputs could be converted into individual predictions. Maybe just some basic plumbing in and after the following (a rough sketch follows the snippet):

CrabNet/crabnet/kingcrab.py

Lines 151 to 157 in 9e0d79c

        if self.attention:
            encoder_layer = nn.TransformerEncoderLayer(self.d_model,
                                                       nhead=self.heads,
                                                       dim_feedforward=2048,
                                                       dropout=0.1)
            self.transformer_encoder = nn.TransformerEncoder(encoder_layer,
                                                             num_layers=self.N)
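
For reference, here's a rough, hypothetical sketch of the kind of plumbing I mean: build the encoder the same way as above and pull the unaveraged (per-head) attention weights out of each layer by calling its `self_attn` module directly. The names `d_model`, `heads`, and `N` mirror the snippet, but the values and the loop are made up for illustration (this is not CrabNet's actual API), and `average_attn_weights=False` requires PyTorch ≥ 1.11. This only exposes the per-head attention maps; turning them into individual predictions would still take more work.

```python
import torch
import torch.nn as nn

# Hypothetical example values; CrabNet's actual defaults may differ.
d_model, heads, N = 512, 4, 3

encoder_layer = nn.TransformerEncoderLayer(d_model,
                                           nhead=heads,
                                           dim_feedforward=2048,
                                           dropout=0.1)
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=N)
transformer_encoder.eval()  # disable dropout so the attention maps are deterministic

x = torch.randn(10, 2, d_model)  # (seq_len, batch, d_model); batch_first=False by default

per_head_weights = []
out = x
with torch.no_grad():
    for layer in transformer_encoder.layers:
        # Re-run just the self-attention with unaveraged weights:
        # shape (batch, heads, seq_len, seq_len), i.e. one map per head.
        _, attn = layer.self_attn(out, out, out,
                                  need_weights=True,
                                  average_attn_weights=False)
        per_head_weights.append(attn)
        out = layer(out)  # normal forward pass through the layer

# Each head's attention map could be treated as one posterior "sample";
# the spread across heads/layers might serve as a crude uncertainty signal.
print(per_head_weights[0].shape)  # e.g. torch.Size([2, 4, 10, 10])
```

Re-calling `layer.self_attn` duplicates the attention computation, but it avoids modifying PyTorch's `TransformerEncoderLayer`; a forward hook would be the less redundant alternative.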
