Fix a hyperparameter value in gpt-neox #161

Closed
jaeminSon opened this issue May 17, 2023 · 0 comments · Fixed by #162

jaeminSon commented May 17, 2023

In GPTNeoXConfig, the `use_parallel_residual` argument was not initially present. (huggingface/transformers@71e6027)

Four months later, the `use_parallel_residual` argument was introduced. (huggingface/transformers@226b0e4)

Models that were trained and uploaded with the earlier commit therefore have no `use_parallel_residual` entry in their config.json, and converting them to the ggml format fails with the following error:

File "../examples/gpt-neox/convert-h5-to-ggml.py", line 61, in <module> fout.write(struct.pack("i", hparams["use_parallel_residual"])) KeyError: 'use_parallel_residual'

Such models include:
https://huggingface.co/EleutherAI/polyglot-ko-1.3b
https://huggingface.co/EleutherAI/polyglot-ko-5.8b
https://huggingface.co/EleutherAI/polyglot-ko-12.8b
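
As a quick check (a minimal sketch, not part of the conversion script), the missing key can be confirmed by downloading the raw config.json from the Hub rather than going through `GPTNeoXConfig`, which would silently fill in the default:

```python
import json

from huggingface_hub import hf_hub_download

# Fetch the raw config.json; loading it via GPTNeoXConfig would inject the
# default value and hide the fact that the key is absent from the uploaded file.
path = hf_hub_download(repo_id="EleutherAI/polyglot-ko-1.3b", filename="config.json")

with open(path) as f:
    hparams = json.load(f)

print("use_parallel_residual" in hparams)  # expected: False for the models above
```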

Since the default value of the parameter is True (https://github.com/huggingface/transformers/blob/d941f07a4e3bc7b61b7afbd25d6e2e8427fccc6d/src/transformers/models/gpt_neox/configuration_gpt_neox.py#L29), which matches the implementation in the earlier commit, the conversion script should fall back to True when the key is missing from config.json.
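
A minimal sketch of the fix (the file paths here are illustrative; the actual script builds `hparams` from the model's config.json and writes header fields to the output file, as the traceback above shows) would replace the direct lookup with one that falls back to True:

```python
import json
import struct

# Path of the downloaded model config (illustrative).
with open("config.json") as f:
    hparams = json.load(f)

# Hypothetical output path, standing in for the script's ggml output file.
with open("ggml-model.bin", "wb") as fout:
    # Fall back to True when "use_parallel_residual" is absent, matching the
    # GPTNeoXConfig default and the behaviour of the pre-argument implementation.
    fout.write(struct.pack("i", int(hparams.get("use_parallel_residual", True))))
```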

CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this issue Dec 18, 2023
Without "static" prefix, it fails to compile in clang