
Pull requests: EleutherAI/gpt-neox


Pull requests list

Add Rotary Positional Embedding
#213 by sdtblck was merged Apr 7, 2021
Draft PR Adding mistral 0.1
#1131 by AIproj was merged Feb 23, 2024
integrated flash attention 2
#1035 by a663E-36z1120 was merged Sep 20, 2023
fused layernorm
#1105 by yang was merged Jan 26, 2024
Add MoE
#1129 by yang was merged Mar 7, 2024
Mamba + Tensor Parallel Support
#1184 by haileyschoelkopf was merged Mar 15, 2024
Add megablocks dropless MoE
#1192 by yang was merged May 4, 2024
Jaimemcc intel/ci composite cpu tests
#1205 by jaimemcc-intel was merged May 4, 2024
LR scheduler fix no longer breaks inference
#1060 by dashstander was merged Oct 17, 2023
Lion Optimizer
#1062 by andylolu2 was merged Oct 20, 2023
PR for Deepspeed Integration
#9 by trisongz was merged Dec 24, 2020
get rid of test file
#10 by sdtblck was merged Dec 26, 2020
make mask value smaller by factor of 2
#25 by lucidrains was merged Jan 4, 2021
test
#1 by lucidrains was merged Dec 22, 2020
Update base_model.json
#93 by srulikbd was merged Jan 26, 2021
Implement distributed training using Kubernetes
#77 by leogao2 was merged Jan 23, 2021
Batch size needs to be specified
#87 by joshlk was merged Jan 23, 2021
Add checkpoint saving / loading
#90 by sdtblck was merged Jan 28, 2021
Remove layer caching
#109 by joshlk was merged Feb 1, 2021