Fix a hyperparameter value in gpt-neox #161
ggerganov pushed a commit that referenced this issue (May 17, 2023).
CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this issue (Dec 18, 2023): "Without "static" prefix, it fails to compile in clang".
The `use_parallel_residual` argument was not initially present in `GPTNeoXConfig` (huggingface/transformers@71e6027). It was introduced about four months later (huggingface/transformers@226b0e4). Models trained and uploaded with the earlier commit therefore do not include `use_parallel_residual` in their `config.json` files, so converting them to the ggml format fails with the following error:
```
File "../examples/gpt-neox/convert-h5-to-ggml.py", line 61, in <module>
    fout.write(struct.pack("i", hparams["use_parallel_residual"]))
KeyError: 'use_parallel_residual'
```
Affected models include:
- https://huggingface.co/EleutherAI/polyglot-ko-1.3b
- https://huggingface.co/EleutherAI/polyglot-ko-5.8b
- https://huggingface.co/EleutherAI/polyglot-ko-12.8b
Since the default value for the parameter is `True` (https://github.com/huggingface/transformers/blob/d941f07a4e3bc7b61b7afbd25d6e2e8427fccc6d/src/transformers/models/gpt_neox/configuration_gpt_neox.py#L29), which corresponds to the implementation in the earlier commit, the converter should fall back to `True` when the key is missing.
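A minimal sketch of the proposed fix, assuming the converter packs the flag with `struct.pack` as in the traceback above (the helper name is hypothetical; the actual script writes the value inline at line 61):

```python
import struct

# Hypothetical helper mirroring the failing line in convert-h5-to-ggml.py.
# Models trained before huggingface/transformers@226b0e4 lack the
# "use_parallel_residual" key in config.json; defaulting to True matches
# the parallel-residual behavior of the pre-change implementation.
def pack_use_parallel_residual(hparams: dict) -> bytes:
    # dict.get with a default avoids the KeyError for older configs.
    use_parallel_residual = hparams.get("use_parallel_residual", True)
    return struct.pack("i", int(use_parallel_residual))
```

With this change, an old polyglot-ko `config.json` (no key present) converts as if `use_parallel_residual` were `True`, while newer configs that set the key explicitly are respected.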