Pull requests: EleutherAI/gpt-neox

Pull requests list

working minimal GPT
#2 by lucidrains was merged Dec 23, 2020; updated Dec 23, 2020
PR for Deepspeed Integration
#9 by trisongz was merged Dec 24, 2020; updated Dec 24, 2020
Create CODEOWNERS
#8 by StellaAthena was merged Dec 24, 2020; updated Dec 24, 2020
get rid of test file
#10 by sdtblck was merged Dec 26, 2020; updated Dec 26, 2020
cleanup deepspeed training scripts
#11 by lucidrains was merged Dec 26, 2020; updated Dec 26, 2020
Create train.sh
#12 by sdtblck was merged Dec 26, 2020; updated Dec 26, 2020
test
#1 by lucidrains was merged Dec 22, 2020; updated Dec 26, 2020
remove enwik8 data from repository
#13 by lucidrains was merged Dec 26, 2020; updated Dec 26, 2020
Add params & remove gpu_monitor
#14 by sdtblck was merged Dec 27, 2020; updated Dec 27, 2020
add tensorboard logging
#15 by sdtblck was merged Dec 27, 2020; updated Dec 27, 2020
add sparse attention support, with ability to specify at which layers…
#16 by lucidrains was merged Dec 27, 2020; updated Dec 27, 2020
add tfrecords dataset and make minor changes to configs/train script
#17 by sdtblck was merged Dec 28, 2020; updated Dec 28, 2020
add ability to use fused layer norm with use_fused_layernorm=True flag
#18 by lucidrains was merged Dec 28, 2020; updated Dec 28, 2020
fix small bug where sequence length is not passed into attention class
#21 by lucidrains was merged Jan 1, 2021; updated Jan 1, 2021
fix small bug where sequence length is not passed into attention clas…
#23 by StellaAthena was merged Jan 1, 2021; updated Jan 1, 2021
GPT-3 Small Works
#24 by StellaAthena was merged Jan 3, 2021; updated Jan 3, 2021
add linear warmup over 5000 steps and gradient clipping
#29 by lucidrains was merged Jan 4, 2021; updated Jan 4, 2021
make mask value smaller by factor of 2
#25 by lucidrains was merged Jan 4, 2021; updated Jan 4, 2021
disable reduce for loss calculation and calculate mean separately
#31 by anthony-dipofi was merged Jan 4, 2021; updated Jan 4, 2021
Automatically download owt2
#33 by steven-mi was merged Jan 4, 2021; updated Jan 4, 2021
Fix error in extracting OWT2 dataset
#35 by steven-mi was merged Jan 4, 2021; updated Jan 4, 2021
Update requirements.txt
#36 by sdtblck was merged Jan 4, 2021; updated Jan 4, 2021
untie classifier weights by default
#30 by lucidrains was merged Jan 4, 2021; updated Jan 4, 2021
Fix deprecation warning
#42 by sdtblck was merged Jan 5, 2021; updated Jan 5, 2021
Add improved data downloading class / pipeline
#39 by sdtblck was merged Jan 5, 2021; updated Jan 5, 2021