
Fix different positional embeddings clashing #147

Merged: 14 commits merged into main from fix-embeddings on Mar 4, 2021

Conversation

@sdtblck (Contributor) commented on Feb 28, 2021

As mentioned on Discord, when using RPE the regular positional embeddings were still being added in.

This should fix that: you now specify the type of positional embedding you want via the --pos-emb argument, choosing from 'learned', 'sinusoidal', and 'rpe'.

Still needs testing.
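For illustration, a minimal sketch of how a mutually exclusive selection like this can be wired up; the names (`build_position_embedding`, `sinusoidal_table`) are hypothetical, not the actual gpt-neox code:

```python
import math

import torch
import torch.nn as nn


def sinusoidal_table(max_len, dim):
    # Fixed sin/cos table from Vaswani et al. (2017); assumes dim is even.
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, dim, 2).float() * (-math.log(10000.0) / dim))
    table = torch.zeros(max_len, dim)
    table[:, 0::2] = torch.sin(position * div_term)  # even dims: sin
    table[:, 1::2] = torch.cos(position * div_term)  # odd dims: cos
    return table


def build_position_embedding(pos_emb, hidden_size, max_seq_len):
    # Exactly one positional scheme is active; the bug was that the learned
    # embedding was added to the input even when RPE was selected.
    if pos_emb == "learned":
        return nn.Embedding(max_seq_len, hidden_size)
    if pos_emb == "sinusoidal":
        emb = nn.Embedding(max_seq_len, hidden_size)
        emb.weight.data.copy_(sinusoidal_table(max_seq_len, hidden_size))
        emb.weight.requires_grad = False  # fixed table, not trained
        return emb
    if pos_emb == "rpe":
        return None  # relative bias is applied inside attention instead
    raise ValueError(f"unknown --pos-emb value: {pos_emb!r}")
```

When this returns None, nothing is added to the token embeddings at the input, and the relative-position logic lives entirely inside the attention layers.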

@sdtblck requested a review from a team as a code owner on February 28, 2021, 19:23
Review threads on megatron/model/gpt2_model.py (resolved)
@sdtblck (Contributor, Author) commented on Mar 2, 2021

We might want to double-check our sinusoidal implementation, but all the others appear to work fine.

[Plot comparing runs: dashed peach = sinusoidal; solid peach = learned embedding summed with input; green = no positional embedding; grey = RPE]
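For reference while double-checking: the standard sinusoidal encoding from Vaswani et al. (2017) is PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A quick eyeball check: at pos = 0 the even (sin) dimensions should be exactly 0 and the odd (cos) dimensions exactly 1.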

@StellaAthena (Member) commented:

Are we going to leave this open until the bug in sinusoidal is fixed, or would you rather we merged and opened a bug issue? @sdtblck

@sdtblck (Contributor, Author) commented on Mar 4, 2021

> Are we going to leave this open until the bug in sinusoidal is fixed, or would you rather we merged and opened a bug issue? @sdtblck

It's unclear whether it's a bug, but if it is, it's in the main branch too. And the RPE stuff just flat-out doesn't work in the main branch, so I'd suggest we merge this.

@StellaAthena (Member) left a review comment:

This mostly works; however, it appears that there's a bug in the sinusoidal embedding.

@StellaAthena merged commit 619e2cd into main on Mar 4, 2021
@StellaAthena deleted the fix-embeddings branch on March 4, 2021, 13:18