Pull requests: EleutherAI/gpt-neox

Pull requests list

Torchtyping
#600 opened Mar 23, 2022 by slash-under Draft updated Apr 23, 2023
Add generating batch size
#592 opened Mar 18, 2022 by zhuzilin updated Apr 23, 2023
Improve our mup implementation
#837 opened Mar 15, 2023 by Quentin-Anthony Draft updated May 2, 2023
2 tasks
MoE Support
#677 opened Sep 19, 2022 by Quentin-Anthony Draft updated May 11, 2023 Release V2
Adds a script to convert NeoX 2.0 checkpoints to DeepSpeed's universal checkpoint format (label: merge-queue)
#836 opened Mar 14, 2023 by dashstander updated Sep 7, 2023
Fixing MuP
#1061 opened Oct 19, 2023 by marcobellagente93 updated Oct 21, 2023
Implement neox_args processing when OMPI_COMM_WORLD_SIZE>1
#1073 opened Nov 7, 2023 by kyuheejang updated Nov 26, 2023
Support for DeepSpeed Ulysses (SP)
#1084 opened Nov 26, 2023 by Quentin-Anthony Draft updated Nov 26, 2023
Adding AxoNN's 3D tensor parallelism [WIP] (label: feature request)
#1086 opened Nov 28, 2023 by siddharth9820 Draft updated Jan 18, 2024
1 of 3 tasks
Add DS inference
#1130 opened Jan 25, 2024 by yang Draft updated Jan 26, 2024
Remove unused requirements-sparseattention
#1177 opened Mar 8, 2024 by segyges Draft updated Mar 8, 2024
Adding replay into GPT-NeoX
#1200 opened Apr 13, 2024 by AIproj updated Apr 15, 2024
Create cmake-multi-platform.yml
#1201 opened Apr 22, 2024 by Romario242003 updated Apr 23, 2024
[muP] Rework
#1087 opened Dec 1, 2023 by lintangsutawika Draft updated May 2, 2024
Added infinite lr schedules (label: merge-queue)
#1194 opened Mar 25, 2024 by kshitijkg updated May 14, 2024
Add lora support
#1225 opened May 20, 2024 by mkerin Draft updated May 20, 2024
Dmoe integration
#1210 opened May 6, 2024 by DayOfThePenguin updated May 22, 2024
Add Transformer Engine
#1213 opened May 10, 2024 by Quentin-Anthony Draft updated May 28, 2024
Add Transformer Engine's version of RMSNorm and LayerNorm
#1235 opened Jun 11, 2024 by lintangsutawika Draft updated Jun 11, 2024
Deepspeed benchmarking
#878 opened Apr 11, 2023 by cr458 Draft updated Jun 18, 2024
Add tensor parallelism for RWKV
#1237 opened Jun 19, 2024 by jahatef Draft updated Jun 19, 2024
Add intermediate_size to GPT-NeoX models
#1212 opened May 10, 2024 by dtamayo-nlp updated Jun 20, 2024
SFT improvements (labeling fixes, different packing implementations)
#1240 opened Jun 21, 2024 by dmahan93 updated Jun 25, 2024
Add DPO training
#1242 opened Jun 25, 2024 by dmahan93 updated Jun 26, 2024