Pull requests: EleutherAI/gpt-neox
cleanup deepspeed training scripts
#11
by lucidrains
was merged Dec 26, 2020
updated Dec 26, 2020
remove enwik8 data from repository
#13
by lucidrains
was merged Dec 26, 2020
updated Dec 26, 2020
Add params & remove gpu_monitor
#14
by sdtblck
was merged Dec 27, 2020
updated Dec 27, 2020
add sparse attention support, with ability to specify at which layers…
#16
by lucidrains
was merged Dec 27, 2020
updated Dec 27, 2020
add tfrecords dataset and make minor changes to configs/train script
#17
by sdtblck
was merged Dec 28, 2020
updated Dec 28, 2020
add ability to use fused layer norm with use_fused_layernorm=True flag
#18
by lucidrains
was merged Dec 28, 2020
updated Dec 28, 2020
fix small bug where sequence length is not passed into attention class
#21
by lucidrains
was merged Jan 1, 2021
updated Jan 1, 2021
fix small bug where sequence length is not passed into attention clas…
#23
by StellaAthena
was merged Jan 1, 2021
updated Jan 1, 2021
add linear warmup over 5000 steps and gradient clipping
#29
by lucidrains
was merged Jan 4, 2021
updated Jan 4, 2021
make mask value smaller by factor of 2
#25
by lucidrains
was merged Jan 4, 2021
updated Jan 4, 2021
disable reduce for loss calculation and calculate mean separately
#31
by anthony-dipofi
was merged Jan 4, 2021
updated Jan 4, 2021
Fix error in extracting OWT2 dataset
#35
by steven-mi
was merged Jan 4, 2021
updated Jan 4, 2021
untie classifier weights by default
#30
by lucidrains
was merged Jan 4, 2021
updated Jan 4, 2021
Add improved data downloading class / pipeline
#39
by sdtblck
was merged Jan 5, 2021
updated Jan 5, 2021