
cannot convert raw llama weights to NeoX #96

Open
scikkk opened this issue Oct 30, 2023 · 2 comments

Comments


scikkk commented Oct 30, 2023

Hello! Thanks for your great work, but I ran into some problems while trying to replicate the results.

Specifically, I cannot find convert_raw_llama_weights_to_hf.py as described in README.md.

However, I did find convert_raw_llama_weights_to_neox.py, which seems to convert the Meta format to the NeoX format.

That script doesn't support --config_file, so I used --model_size=7B instead. Unfortunately, I hit an error:

(gptneox2) root@ebf9662e:/math-lm# bash convert.sh 
sequential
dict_keys(['dim', 'n_layers', 'n_heads', 'multiple_of', 'ffn_dim_multiplier', 'norm_eps', 'rope_theta'])
Traceback (most recent call last):
  File "gpt-neox/tools/convert_raw_codellama_weights_to_neox.py", line 650, in <module>
    main()
  File "gpt-neox/tools/convert_raw_codellama_weights_to_neox.py", line 641, in main
    convert_model_sequential(
  File "gpt-neox/tools/convert_raw_codellama_weights_to_neox.py", line 308, in convert_model_sequential
    num_kv_heads = params["n_kv_heads"]
KeyError: 'n_kv_heads'

Here is my convert.sh:

python convert_raw_codellama_weights_to_neox.py \
 --input_dir /math-lm/codellama \
 --model_size 7B \
 --output_dir /math-lm/codellama/7B-NeoX \
 --num_output_shards 2

Here is my raw codellama7b:

(gptneox2) root@ebf9662e:/math-lm/codellama/7B# tree .
.
├── checklist.chk
├── consolidated.00.pth
├── params.json
└── tokenizer.model

Looking forward to your reply, any help will be appreciated!


scikkk commented Oct 30, 2023

I made 2 changes and the conversion succeeded:

# LINE 308: num_kv_heads = params["n_kv_heads"]
num_kv_heads = params["n_kv_heads"] if "n_kv_heads" in params else params["n_heads"]
# LINE 383: if model_size == "7B":
if model_size == "7B" and "layers.0.attention.inner_attention.rope.freqs" in loaded[0]:
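For reference, the first change is the standard dict-fallback pattern: raw LLaMA checkpoints for models without grouped-query attention omit the "n_kv_heads" key from params.json, so the key/value head count should fall back to "n_heads". A minimal sketch of that logic (the params dicts below are illustrative, not taken from any real checkpoint):

```python
def kv_heads(params):
    # Equivalent to the patched line 308, written with dict.get:
    # use n_kv_heads when present, otherwise fall back to n_heads.
    return params.get("n_kv_heads", params["n_heads"])

# Illustrative configs: a 7B-style model without grouped-query attention,
# and a larger GQA-style model that does carry n_kv_heads.
mha_params = {"dim": 4096, "n_layers": 32, "n_heads": 32}
gqa_params = {"dim": 8192, "n_layers": 80, "n_heads": 64, "n_kv_heads": 8}

print(kv_heads(mha_params))  # falls back to n_heads -> 32
print(kv_heads(gqa_params))  # uses n_kv_heads -> 8
```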

andrewarrow commented Nov 26, 2023

I'm stuck: my codellama/7B/params.json file isn't valid JSON. It's an HTML file?

    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

When I open the HTML file in a browser, it's a "Sign in to continue to Gmail" login page.
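A quick way to catch this failure mode before running the converter is to check that params.json actually parses as JSON; a botched or unauthenticated download often saves an HTML login page under the expected filename. A small sketch (the helper name check_params and the error message are mine, not part of the repo):

```python
import json

def check_params(path):
    # Return the parsed params dict, or raise with a readable hint
    # when the file is actually an HTML page rather than JSON.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    if text.lstrip().startswith("<"):
        raise ValueError(
            f"{path} looks like HTML, not JSON -- re-download the checkpoint"
        )
    return json.loads(text)

# Usage: check_params("/math-lm/codellama/7B/params.json")
```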
