Commit
fix small bug where sequence length is not passed into attention class (#21) (#23)

* fix small bug where sequence length is not passed into attention class
* fix bug with mask and half values, as well as masking in dense attention
* make sure install deepspeed with pip sudo

This allows `gpt3small` to run but does not fix the problems with sparse attention. See #22

Co-authored-by: Phil Wang <[email protected]>