Issues: EleutherAI/gpt-neox

Issues list

Add Basic RWKV Block to GPT-NeoX feature request New feature or request
#1167 by Quentin-Anthony was closed Jun 19, 2024 updated Jun 19, 2024
4 tasks done
Cannot perform inference, be it unconditional, input-file or interactive bug Something isn't working
#1228 by srivassid was closed May 30, 2024 updated May 30, 2024
'intermediate_size' not set in tools/ckpts/convert_neox_to_hf.py for neox model architecture bug Something isn't working
#1208 by jvendrow was closed May 4, 2024 updated May 4, 2024
distributed training with multiple nodes. bug Something isn't working
#734 by cdj0311 was closed Dec 28, 2022 updated Apr 22, 2024
When llama uses bf16 training, there is an abnormal loss bug Something isn't working
#947 by suolyer was closed Apr 21, 2024 updated Apr 21, 2024
Officially Support AMD GPUs feature request New feature or request
#954 by Quentin-Anthony was closed Apr 21, 2024 updated Apr 21, 2024
4 tasks done
Is there a way to disable data sampling? feature request New feature or request
#1005 by haozhouamzn was closed Apr 21, 2024 updated Apr 21, 2024
FileNotFoundError thrown when training bug Something isn't working
#1127 by obicons was closed Apr 21, 2024 updated Apr 21, 2024
How to convert gpt-neox to llama architecture..?
#1151 by yuri-son was closed Apr 21, 2024 updated Apr 21, 2024
is there any ignore_index ability in the loss calculation? feature request New feature or request
#1193 by exnx was closed Apr 21, 2024 updated Apr 21, 2024
The best mobile phone repair shop in holy Mashhad bug Something isn't working
#1173 by rezaarefi was closed Apr 19, 2024 updated Apr 19, 2024
Large model instantiation using DeepSpeed.zero.Init under ZeRO-3 feature request New feature or request
#1189 by R0n12 was closed Mar 19, 2024 updated Mar 19, 2024
can you provide pre-built images for main branch feature request New feature or request
#1019 by xu-song was closed Mar 17, 2024 updated Mar 17, 2024
continue training from a checkpoint with different number of gpu/node
#1158 by mackmake was closed Mar 15, 2024 updated Mar 15, 2024
Add basic Mamba block feature request New feature or request
#1148 by Quentin-Anthony was closed Mar 10, 2024 updated Mar 10, 2024
3 of 4 tasks
MoE loss variable not defined in gpt j residual code path bug Something isn't working
#1174 by tf-nv was closed Mar 8, 2024 updated Mar 8, 2024
Add Mixture of Experts feature request New feature or request
#479 by sdtblck was closed Mar 7, 2024 updated Mar 7, 2024
misindexing when converting llama weights to gpt-neox format bug Something isn't working
#971 by CRSilkworth was closed Feb 28, 2024 updated Mar 6, 2024
pipe_parallel_size = 1 using DeepSpeed PipelineEngine bug Something isn't working
#1172 by DayOfThePenguin was closed Mar 6, 2024 updated Mar 6, 2024
Converting Pythia checkpoint from HF to NeoX fails bug Something isn't working
#1161 by malteos was closed Mar 4, 2024 updated Mar 4, 2024
Dockerfile installation fails to run pythia 14M bug Something isn't working
#1165 by tf-nv was closed Mar 4, 2024 updated Mar 4, 2024
[BUG] Setting Finetune=True causes checkpoint loading to not work correctly bug Something isn't working
#1121 by exnx was closed Mar 1, 2024 updated Mar 1, 2024