Pull requests: EleutherAI/gpt-neox

LR scheduler fix no longer breaks inference
#1060 by dashstander was merged Oct 17, 2023
Add documentation about using labelled datasets
#1056 by dashstander was merged Oct 4, 2023
Organize the tools directory
#1055 by dashstander was merged Oct 2, 2023
Patch LR Annealing Bug
#1046 by dashstander was merged Sep 27, 2023
Improve FLOPS Calculation
#1044 by dashstander was merged Sep 27, 2023
Remove the NeoX implementation of GPT2Tokenizer
#1042 by dashstander was merged Sep 25, 2023
Pre-compute RoPE embeddings in fp32
#1041 by dashstander was merged Sep 25, 2023
Remove support for lazy dataset implementation
#1033 by dashstander was merged Sep 18, 2023
Fix bf16 for zero > 0 and pipeline parallelism > 0
#1032 by dashstander was merged Sep 18, 2023
Bump transformers version and update enwik8 link
#1024 by dashstander was merged Sep 13, 2023
ALibi & Flash Attention
#864 by dashstander was merged Apr 11, 2023
Add DeepSpeed bf16 configuration
#787 by dashstander was merged May 16, 2023
Adds the proper DeepSpeed bf16 config
#786 by dashstander was closed Feb 14, 2023
Save checkpoint metadata when using SlurmRunner
#733 by dashstander was merged Dec 8, 2022
Slurm Fix and formatting
#729 by dashstander was merged Dec 6, 2022
SLURM Multi-Node Support
#680 by dashstander was merged Sep 22, 2022