Pull requests: EleutherAI/gpt-neox

Adds the proper DeepSpeed bf16 config (#786 by dashstander, closed Feb 14, 2023)
ALibi & Flash Attention (#864 by dashstander, merged Apr 11, 2023)
Bump transformers version and update enwik8 link (#1024 by dashstander, merged Sep 13, 2023)
Fix bf16 for zero > 0 and pipeline parallelism > 0 (#1032 by dashstander, merged Sep 18, 2023)
Remove support for lazy dataset implementation (#1033 by dashstander, merged Sep 18, 2023)
Remove the NeoX implementation of GPT2Tokenizer (#1042 by dashstander, merged Sep 25, 2023)
Improve FLOPS Calculation (#1044 by dashstander, merged Sep 27, 2023)
Patch LR Annealing Bug (#1046 by dashstander, merged Sep 27, 2023)
Organize the tools directory (#1055 by dashstander, merged Oct 2, 2023)
Add documentation about using labelled datasets (#1056 by dashstander, merged Oct 4, 2023)
LR scheduler fix no longer breaks inference (#1060 by dashstander, merged Oct 17, 2023)
Save checkpoint metadata when using SlurmRunner (#733 by dashstander, merged Dec 8, 2022)
Pre-compute RoPE embeddings in fp32 (#1041 by dashstander, merged Sep 25, 2023)
SLURM Multi-Node Support (#680 by dashstander, merged Sep 22, 2022)
Slurm Fix and formatting (#729 by dashstander, merged Dec 6, 2022)
Add DeepSpeed bf16 configuration (#787 by dashstander, merged May 16, 2023)