Issues: EleutherAI/gpt-neox
Issues list
continue training from a checkpoint with different number of gpu/node
#1158
by mackmake
was closed Mar 15, 2024
Cannot perform inference, be it unconditional, input-file or interactive
bug
Something isn't working
#1228
by srivassid
was closed May 30, 2024
The results of running eval show only 1 digit after decimal point for acc on all tested tasks
bug
Something isn't working
#1227
by lernerjenny
was closed Jul 9, 2024
'intermediate_size' not set in tools/ckpts/convert_neox_to_hf.py for neox model architecture
bug
Something isn't working
#1208
by jvendrow
was closed May 4, 2024
is there any ignore_index ability in the loss calculation?
feature request
New feature or request
#1193
by exnx
was closed Apr 21, 2024
Large model instantiation using DeepSpeed.zero.Init under ZeRO-3
feature request
New feature or request
#1189
by R0n12
was closed Mar 19, 2024
MoE loss variable not defined in gpt j residual code path
bug
Something isn't working
#1174
by tf-nv
was closed Mar 8, 2024
The best mobile phone repair shop in the holy city of Mashhad
bug
Something isn't working
#1173
by rezaarefi
was closed Apr 19, 2024
pipe_parallel_size = 1 using DeepSpeed PipelineEngine
bug
Something isn't working
#1172
by DayOfThePenguin
was closed Mar 6, 2024
Add Basic RWKV Block to GPT-NeoX
feature request
New feature or request
#1167
by Quentin-Anthony
was closed Jun 19, 2024
4 tasks done
Dockerfile installation fails to run pythia 14M
bug
Something isn't working
#1165
by tf-nv
was closed Mar 4, 2024
Is there a way to train on the entire dataset for N epochs without specifying train-iters?
#1164
by javirandor
was closed Mar 18, 2024
Converting Pythia checkpoint from HF to NeoX fails
bug
Something isn't working
#1161
by malteos
was closed Mar 4, 2024
Add PyTorch Memory Profiler
feature request
New feature or request
#1152
by Quentin-Anthony
was closed Feb 21, 2024
Add basic Mamba block
feature request
New feature or request
#1148
by Quentin-Anthony
was closed Mar 10, 2024
3 of 4 tasks
NCCL error in: ProcessGroupNCCL.cpp:1269, internal error, NCCL version 2.14.3
#1147
by mackmake
was closed Apr 21, 2024
Update to current versions of python and pytorch
feature request
New feature or request
#1143
by segyges
was closed Feb 23, 2024
Port NVIDIA Nsight profiling to gpt-neox
feature request
New feature or request
#1134
by Quentin-Anthony
was closed Feb 23, 2024
1 of 2 tasks
FileNotFoundError thrown when training
bug
Something isn't working
#1127
by obicons
was closed Apr 21, 2024
[BUG] Setting Finetune=True causes checkpoint loading to not work correctly
bug
Something isn't working
#1121
by exnx
was closed Mar 1, 2024