
gguf-py: Support 01.AI Yi models #3943

Merged
merged 1 commit into ggerganov:master from feat-gguf-yi-support on Nov 4, 2023

Conversation

@KerfuffleV2 (Collaborator) commented Nov 4, 2023

Tiny change to support Yi model layernorm tensor names. Architecturally, it's the same as LLaMA2. See 01-ai/Yi#1
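For readers curious what the rename amounts to, here is a hedged sketch (not the actual PR diff) of mapping Yi's `ln1`/`ln2` layernorm names onto the standard GGUF block names, assuming gguf-py's usual `{bid}` block-index placeholder convention:

```python
# Sketch of the kind of alias the fix adds. Yi names its per-block layernorms
# "ln1"/"ln2" where LLaMA uses "input_layernorm"/"post_attention_layernorm";
# the GGUF target names below follow the usual blk.N.* convention.
YI_ALIASES = {
    "model.layers.{bid}.ln1": "blk.{bid}.attn_norm",  # Yi's input layernorm
    "model.layers.{bid}.ln2": "blk.{bid}.ffn_norm",   # Yi's post-attention layernorm
}

def map_tensor_name(name, n_blocks=60):
    """Resolve a Yi checkpoint tensor name to its GGUF name, or None if unknown."""
    base = name.removesuffix(".weight")
    for bid in range(n_blocks):
        for src, dst in YI_ALIASES.items():
            if base == src.format(bid=bid):
                return dst.format(bid=bid) + ".weight"
    return None
```

In gguf-py the real lookup is table-driven rather than a loop like this, but the effect is the same: two extra alias rows, no architecture changes.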

The model posts impressively high MMLU results, higher than any 70B model, actually. Whether that's valid or translates to real-world results, I don't know.

I successfully converted this model: https://huggingface.co/01-ai/Yi-34B (note that there's both Safetensors and PyTorch versions in the same repo, so be careful unless you actually want to download two copies of the model).

The quantized version runs just fine. I didn't test the 6B, but looking at the tensor index it appears to be the same.

@TheBloke - tagging in case you're interested in trying to convert this one.

@KerfuffleV2 added labels model (Model specific), script (Script related) on Nov 4, 2023
@TheBloke (Contributor) commented Nov 4, 2023

> @TheBloke - tagging in case you're interested in trying to convert this one.

Oh go on then, you twisted my arm!

Working fine, thanks very much! Quants uploading now.

@KerfuffleV2 KerfuffleV2 merged commit f28af0d into ggerganov:master Nov 4, 2023
6 checks passed
@KerfuffleV2 KerfuffleV2 deleted the feat-gguf-yi-support branch November 17, 2023 03:12
@tastypear commented

@KerfuffleV2 Is there any chance to support vivo's BlueLM?

This model has two tensors, model.embed_layer_norm.weight and model.embed_layer_norm.bias, that llama.cpp cannot recognize.

...
Permuting layer 31
model.embed_tokens.weight                        -> token_embd.weight                        | BF16   | [100096, 4096]
Traceback (most recent call last):
  File "/content/./llama.cpp/convert.py", line 1230, in <module>
    main()
  File "/content/./llama.cpp/convert.py", line 1216, in main
    model   = convert_model_names(model, params)
  File "/content/./llama.cpp/convert.py", line 1006, in convert_model_names
    raise Exception(f"Unexpected tensor name: {name}")
Exception: Unexpected tensor name: model.embed_layer_norm.weight

Someone on a Chinese forum said this problem is similar to the Yi model's, but I couldn't fix it myself.

@KerfuffleV2 (Collaborator, Author) commented

@tastypear It looks like that one might need its own model architecture. Yi just had the exact same architecture except that two tensors were named differently, so that was really easy to fix.

I looked at the forum you linked. I have a harder time reading non-unified diffs than the Chinese part! What kind of monster uses non-unified diffs? But it seems like they had to mess around with the C++ code as well.

You can try editing gguf-py/gguf/tensor_mapping.py and adding "model.embed_layer_norm" to the MODEL_TENSOR.OUTPUT_NORM entry.

Like:

        # Output norm
        MODEL_TENSOR.OUTPUT_NORM: (
            "gpt_neox.final_layer_norm",               # gptneox
            "model.embed_layer_norm",                  # BlueLM <-- new
            # ...existing entries unchanged...
        ),
@tastypear commented Nov 21, 2023

@KerfuffleV2 very close, but...
convert log:

...
Permuting layer 31
model.embed_layer_norm.weight                    -> output_norm.weight                       | BF16   | [4096]
model.embed_layer_norm.bias                      -> output_norm.bias                          | BF16   | [4096]
...
model.norm.weight                                -> output_norm.weight                       | BF16   | [4096]
...

There are 2 tensors named output_norm.weight, so one of them is missing when loading .gguf.

error loading model: done_getting_tensors: wrong number of tensors; expected 292, got 291
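What seems to be happening (a hypothetical reconstruction, not verified against the converter): with the extra alias, two distinct source tensors both resolve to output_norm.weight, so the writer emits that name twice while the loader expects every tensor name to be unique:

```python
from collections import Counter

# Reproduction sketch of the collision: after adding "model.embed_layer_norm"
# as an alias, both it and "model.norm" resolve to the same GGUF name.
mapping = {
    "model.embed_layer_norm": "output_norm",
    "model.norm": "output_norm",   # both map to the same target
}

written = [mapping[src] + ".weight" for src in mapping]
dupes = [name for name, n in Counter(written).items() if n > 1]
# dupes == ["output_norm.weight"] -- the loader then finds 291 unique names
# where the file header promised 292.
```

A real fix would presumably need either a new architecture for BlueLM or logic to drop/merge one of the colliding tensors, which is why a one-line alias isn't enough here.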

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
@KerfuffleV2 (Collaborator, Author) commented

@tastypear Sorry for the slow reply. I kept meaning to get back to this. Unfortunately, I don't really know what would be required to fix that issue, and I probably won't have the time to mess with it. Hopefully you or someone else will manage to fix it. It seems to need considerably more work than the Yi model, which was exactly the same as LLaMA except that two tensors had different names.

Anyway, not much of a response but I didn't want to just ignore you.

@tastypear commented
@KerfuffleV2 I just happened to see someone mention a related model and thought it would be easy to solve, so I took the liberty to ask. Thank you for taking the time to reply to me😉
