Pull requests: EleutherAI/gpt-neox
Pull requests list
add sparse attention support, with ability to specify at which layers…
#16
by lucidrains
was merged Dec 27, 2020
add tfrecords dataset and make minor changes to configs/train script
#17
by sdtblck
was merged Dec 28, 2020
add ability to use fused layer norm with use_fused_layernorm=True flag
#18
by lucidrains
was merged Dec 28, 2020
fix small bug where sequence length is not passed into attention class
#21
by lucidrains
was merged Jan 1, 2021
fix small bug where sequence length is not passed into attention clas…
#23
by StellaAthena
was merged Jan 1, 2021
add linear warmup over 5000 steps and gradient clipping
#29
by lucidrains
was merged Jan 4, 2021
Add enron_jsonl and enron_tfr datasets (mostly for testing)
#56
by sdtblck
was merged Jan 13, 2021