Pull requests: EleutherAI/gpt-neox

Pull requests list

SLURM Multi-Node Support
#680 by dashstander was merged Sep 22, 2022; updated Sep 22, 2022
Slurm Fix and formatting
#729 by dashstander was merged Dec 6, 2022; updated Dec 6, 2022
Save checkpoint metadata when using SlurmRunner
#733 by dashstander was merged Dec 8, 2022; updated Dec 8, 2022
Adds the proper DeepSpeed bf16 config
#786 by dashstander was closed Feb 14, 2023; updated Feb 14, 2023
Implement DeepSpeed Main autotuning for NeoX
#739 by dashstander was merged Mar 9, 2023; updated Mar 9, 2023; Release V2
ALibi & Flash Attention
#864 by dashstander was merged Apr 11, 2023; updated Apr 11, 2023
Add DeepSpeed bf16 configuration
#787 by dashstander was merged May 16, 2023; updated May 16, 2023
Updates bf16 demo config and mixed precision documentation.
#941 by dashstander was merged May 18, 2023; updated May 18, 2023
Return to broadcasting megatron and deepspeed configs as training.py arguments
#948 by dashstander was merged May 23, 2023; updated May 23, 2023
Bump transformers version and update enwik8 link
#1024 by dashstander was merged Sep 13, 2023; updated Sep 13, 2023
Fix bf16 for zero > 0 and pipeline parallelism > 0
#1032 by dashstander was merged Sep 18, 2023; updated Sep 18, 2023
Remove support for lazy dataset implementation
#1033 by dashstander was merged Sep 18, 2023; updated Sep 18, 2023
Remove the NeoX implementation of GPT2Tokenizer
#1042 by dashstander was merged Sep 25, 2023; updated Sep 25, 2023
Pre-compute RoPE embeddings in fp32
#1041 by dashstander was merged Sep 25, 2023; updated Sep 25, 2023
Patch LR Annealing Bug
#1046 by dashstander was merged Sep 27, 2023; updated Sep 27, 2023
Improve FLOPS Calculation
#1044 by dashstander was merged Sep 27, 2023; updated Sep 27, 2023
Add section to the README detailing how to start distributed jobs
#1048 by dashstander was merged Sep 29, 2023; updated Sep 29, 2023
Organize the tools directory
#1055 by dashstander was merged Oct 2, 2023; updated Oct 2, 2023
Add documentation about using labelled datasets
#1056 by dashstander was merged Oct 4, 2023; updated Oct 4, 2023
LR scheduler fix no longer breaks inference
#1060 by dashstander was merged Oct 17, 2023; updated Oct 17, 2023