Issues: EleutherAI/gpt-neox
Issues list
'attention.bias' and 'attention.masked_bias' not in hf_layer.state_dict() when converting gpt-neox model to huggingface
bug
Something isn't working
#1013
by johntzwei
was closed Sep 13, 2023
Issues installing - Bare with me please
bug
Something isn't working
#1006
by MistakingManx
was closed Jul 28, 2023
Is there a way to disable data sampling?
feature request
New feature or request
#1005
by haozhouamzn
was closed Apr 21, 2024
RotaryEmbedding computation is wrong for certain position/feature pairs in reduced precision (both fp16 and bfloat)
bug
Something isn't working
#1003
by cbcase
was closed Sep 25, 2023
The class with the same name was imported twice
bug
Something isn't working
#999
by D-X-Y
was closed Sep 25, 2023
The xformers result can not match with norm attention result
feature request
New feature or request
#998
by guozhiyao
was closed Jul 20, 2023
how to use when --mask-before-token have values
feature request
New feature or request
#995
by xealml
was closed Oct 4, 2023
A clearer explanation of data-weights with more details
feature request
New feature or request
#992
by leocnj
was closed Jul 8, 2023
NCCL backend in DeepSpeed not yet implemented
bug
Something isn't working
#990
by jiezhangGt
was closed Jul 5, 2023
Training gpt stuck at the beginning
feature request
New feature or request
#988
by jiezhangGt
was closed Jul 30, 2023
Distributed training with model parallelism hangs with the recent PR
bug
Something isn't working
#985
by absol13
was closed Jul 10, 2023
Any plans to implement multi-query attention
feature request
New feature or request
#982
by crazyofapple
was closed Jun 27, 2023
input decoding error on the args string of train.py
bug
Something isn't working
#977
by nevakrien
was closed Jun 27, 2023
train shows an unexpected exception in best_download no explanation given
bug
Something isn't working
#976
by nevakrien
was closed Jul 24, 2023
WARNING: shuffle index length is not equal to sample index length
bug
Something isn't working
#972
by 1ittlesnow
was closed Jun 22, 2023
misindexing when converting llama weights to gpt-neox format
bug
Something isn't working
#971
by CRSilkworth
was closed Feb 28, 2024
Any plans on supporting modeling_tf_gpt_neox to hugging face transformers models
feature request
New feature or request
#970
by praneethgb
was closed Jun 7, 2023
Substantial decrease in FLOPs per GPU when training multinode
#965
by davidvblumenthal
was closed Jul 11, 2023