
fix small bug where sequence length is not passed into attention class #21

Merged 3 commits into main on Jan 1, 2021

Conversation

lucidrains
Contributor

No description provided.

@lucidrains lucidrains requested a review from a team as a code owner January 1, 2021 05:26
Member

@StellaAthena StellaAthena left a comment


I tried this change, and it wasn't enough to get the code running on the server. See #22

@StellaAthena StellaAthena self-requested a review January 1, 2021 16:22
Member

@StellaAthena StellaAthena left a comment


This allows `gpt3small` to run, but does not fix the problems with sparse attention (see #22).

@StellaAthena StellaAthena merged commit 7043aac into main Jan 1, 2021
@StellaAthena StellaAthena deleted the pw/fix-seq-len branch January 1, 2021 16:24
StellaAthena added a commit that referenced this pull request Jan 1, 2021
fix small bug where sequence length is not passed into attention class (#21) (#23)

* fix small bug where sequence length is not passed into attention class

* fix bug with mask and half values, as well as masking in dense attention

* make sure install deepspeed with pip sudo

This allows `gpt3small` to run but does not fix the problems with sparse attention. See #22

Co-authored-by: Phil Wang <[email protected]>
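The first two commits touch two common transformer pitfalls: an attention module built without knowing the sequence length (so its causal mask cannot match the input), and a mask fill value that misbehaves at half precision. The sketch below is a minimal NumPy illustration of both fixes, not the repository's actual code; the function name `causal_attention` and its signature are purely illustrative.

```python
import numpy as np

def causal_attention(q, k, v, seq_len=None):
    """Scaled dot-product attention with a causal mask.

    The sequence length is threaded through explicitly; the bug fixed in
    this PR was constructing the attention module without passing it along.
    """
    if seq_len is None:
        seq_len = q.shape[-2]
    scores = (q @ k.swapaxes(-1, -2)) / np.sqrt(q.shape[-1])
    # Mask out future positions. Use the dtype's own minimum instead of a
    # hard-coded -1e9, which overflows to -inf in float16 and can produce
    # NaNs out of the softmax -- the "mask and half values" issue named in
    # the second commit.
    causal = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(causal, np.finfo(scores.dtype).min, scores)
    # Numerically stable softmax over the last axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With float16 inputs this stays finite: position 0 attends only to itself, while later positions average over their visible prefix.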