
Pull requests: EleutherAI/gpt-neox


Pull requests list

Conversion script bugfixes
#1218 by haileyschoelkopf was merged Jun 7, 2024
Mamba + Tensor Parallel Support
#1184 by haileyschoelkopf was merged Mar 15, 2024
Switch to using Cuda Flash Attn for Alibi
#1183 by haileyschoelkopf was merged Mar 13, 2024
remove best_download as dependency
#1179 by haileyschoelkopf was merged Mar 8, 2024
Make rotary freqs buffer non-persistent
#1168 by haileyschoelkopf was merged Mar 4, 2024
Improve argument validation for Flash-attn + SWA
#1162 by haileyschoelkopf was merged Mar 2, 2024
Add Mamba Architecture
#1157 by haileyschoelkopf was merged Mar 10, 2024
[Bug?] Fix profiling argument names
#1155 by haileyschoelkopf was merged Feb 26, 2024
Update lm_eval v0.4 to PyPI dependencies
#1141 by haileyschoelkopf was merged Feb 1, 2024
Enable passing of --account to srun / SlurmLauncher
#1126 by haileyschoelkopf was merged Jan 19, 2024
Improve Conversion Utilities
#1124 by haileyschoelkopf was merged Feb 8, 2024
9 tasks done
Lm eval 0.4.0 support
#1101 by haileyschoelkopf was merged Dec 23, 2023
Pin version of lm_eval
#1070 by haileyschoelkopf was merged Nov 1, 2023
Edge-casing for multi-GPU HF-to-NeoX conversion
#1065 by haileyschoelkopf was merged Nov 1, 2023
Add WandB report to Llemma Readme
#1063 by haileyschoelkopf was merged Oct 23, 2023
Add s3 checkpoint syncing
#1010 by haileyschoelkopf was merged Sep 23, 2023
Hacky FlashAttn 2 support
#1000 by haileyschoelkopf was closed Sep 22, 2023 Draft
[Math-LM] No load scheduler
#969 by haileyschoelkopf was merged Jun 7, 2023
LLaMA-to-HF conversion
#960 by haileyschoelkopf was closed Feb 21, 2024 Draft
Minor Usability Fixes for HF Conversion
#853 by haileyschoelkopf was merged Mar 23, 2023
Add tiktoken Tokenizer support
#779 by haileyschoelkopf was merged Jan 31, 2023
Default use_cache=False in eval harness integration
#774 by haileyschoelkopf was merged Feb 3, 2023