
fix small bug where sequence length is not passed into attention class (#21) #23

Merged
merged 1 commit into dev-model-parallel on Jan 1, 2021

Conversation

StellaAthena
Member

fix small bug where sequence length is not passed into attention class (#21)

  • fix small bug where the sequence length is not passed into the attention class

  • fix a bug with the mask and half-precision values, as well as masking in dense attention

  • make sure DeepSpeed is installed with `pip` via `sudo`

This allows `gpt3small` to run but does not fix the problems with sparse attention. See #22.

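The first two fixes can be illustrated with a minimal sketch. This is not the repository's actual code (the names and signatures below are hypothetical): it shows why an attention module that builds its own causal mask must be told the sequence length explicitly, and why the mask fill value must stay finite in half precision, since float16 cannot represent the common `-1e9` fill and overflows to `-inf`/NaN.

```python
import numpy as np

# Finite fill value that is representable in fp16 (about -65504),
# unlike -1e9, which overflows float16.
MASK_VALUE = float(np.finfo(np.float16).min)

def causal_mask(seq_len):
    # Lower-triangular mask: position i may attend only to positions <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def attention(q, k, v, seq_len):
    # seq_len is passed in explicitly (the first fix above) rather than
    # silently assumed by the module.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Masked positions get a large-but-finite negative score (the second fix).
    scores = np.where(causal_mask(seq_len), scores, MASK_VALUE)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With this layout, the first query position can only attend to itself, so its output row equals the first value row regardless of the other inputs.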
@StellaAthena StellaAthena requested a review from a team as a code owner January 1, 2021 16:33
@StellaAthena StellaAthena requested review from lucidrains and ConnorJL and removed request for a team January 1, 2021 16:33
@StellaAthena StellaAthena merged commit 3c7a44a into dev-model-parallel Jan 1, 2021