
gguf-py: Support 01.AI Yi models #3943

Merged
merged 1 commit into ggerganov:master from feat-gguf-yi-support on Nov 4, 2023

Conversation

@KerfuffleV2 (Collaborator) commented Nov 4, 2023

Tiny change to support Yi model layernorm tensor names. Architecturally, it's the same as LLaMA2. See 01-ai/Yi#1
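For readers curious what the rename amounts to, here is a hedged sketch (not the actual PR diff) of mapping Yi's `ln1`/`ln2` layernorm names onto the standard GGUF block names, assuming gguf-py's usual `{bid}` block-index placeholder convention:

```python
# Sketch of the kind of alias the fix adds. Yi names its per-block layernorms
# "ln1"/"ln2" where LLaMA uses "input_layernorm"/"post_attention_layernorm";
# the GGUF target names below follow the usual blk.N.* convention.
YI_ALIASES = {
    "model.layers.{bid}.ln1": "blk.{bid}.attn_norm",  # Yi's input layernorm
    "model.layers.{bid}.ln2": "blk.{bid}.ffn_norm",   # Yi's post-attention layernorm
}

def map_tensor_name(name, n_blocks=60):
    """Resolve a Yi checkpoint tensor name to its GGUF name, or None if unknown."""
    base = name.removesuffix(".weight")
    for bid in range(n_blocks):
        for src, dst in YI_ALIASES.items():
            if base == src.format(bid=bid):
                return dst.format(bid=bid) + ".weight"
    return None
```

In gguf-py the real lookup is table-driven rather than a loop like this, but the effect is the same: two extra alias rows, no architecture changes.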

The model posts impressively high MMLU results, higher than any 70B model, actually. Whether that's valid or translates to real-world results, I don't know.

I successfully converted this model: https://huggingface.co/01-ai/Yi-34B (note that there's both Safetensors and PyTorch versions in the same repo, so be careful unless you actually want to download two copies of the model).

The quantized version runs just fine. I didn't test the 6B, but looking at the tensor index it appears to be the same.

@TheBloke - tagging in case you're interested in trying to convert this one.

@KerfuffleV2 added labels model (Model specific), script (Script related) on Nov 4, 2023
@TheBloke (Contributor) commented Nov 4, 2023

> @TheBloke - tagging in case you're interested in trying to convert this one.

Oh go on then, you twisted my arm!

Working fine, thanks very much! Quants uploading now.

@KerfuffleV2 KerfuffleV2 merged commit f28af0d into ggerganov:master Nov 4, 2023
6 checks passed
@KerfuffleV2 KerfuffleV2 deleted the feat-gguf-yi-support branch November 17, 2023 03:12
@tastypear commented

@KerfuffleV2 Is there any chance to support vivo's BlueLM?

This model has two tensors, model.embed_layer_norm.weight and model.embed_layer_norm.bias, that llama.cpp cannot recognize.

...
Permuting layer 31
model.embed_tokens.weight                        -> token_embd.weight                        | BF16   | [100096, 4096]
Traceback (most recent call last):
  File "/content/./llama.cpp/convert.py", line 1230, in <module>
    main()
  File "/content/./llama.cpp/convert.py", line 1216, in main
    model   = convert_model_names(model, params)
  File "/content/./llama.cpp/convert.py", line 1006, in convert_model_names
    raise Exception(f"Unexpected tensor name: {name}")
Exception: Unexpected tensor name: model.embed_layer_norm.weight

Someone on a Chinese forum said this problem is similar to the Yi model's, but I couldn't fix it myself.

@KerfuffleV2 (Collaborator, Author) commented

@tastypear It looks like that one might need its own model architecture. Yi just had the exact same architecture except that two tensors were named differently, so that was really easy to fix.

I looked at the forum you linked. I have a harder time reading non-unified diffs than the Chinese part! What kind of monster uses non-unified diffs? But it seems like they had to mess around with the C++ code as well.

You can try editing gguf-py/gguf/tensor_mapping.py and adding "model.embed_layer_norm" to the MODEL_TENSOR.OUTPUT_NORM entry.

Like:

        # Output norm
        MODEL_TENSOR.OUTPUT_NORM: (
            "gpt_neox.final_layer_norm",               # gptneox
            "model.embed_layer_norm",                  # BlueLM <-- new
            # ...existing entries unchanged...
        ),
@tastypear commented Nov 21, 2023

@KerfuffleV2 very close, but...
convert log:

...
Permuting layer 31
model.embed_layer_norm.weight                    -> output_norm.weight                       | BF16   | [4096]
model.embed_layer_norm.bias                      -> output_norm.bias                          | BF16   | [4096]
...
model.norm.weight                                -> output_norm.weight                       | BF16   | [4096]
...

There are 2 tensors named output_norm.weight, so one of them is missing when loading .gguf.

error loading model: done_getting_tensors: wrong number of tensors; expected 292, got 291
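What seems to be happening (a hypothetical reconstruction, not verified against the converter): with the extra alias, two distinct source tensors both resolve to output_norm.weight, so the writer emits that name twice while the loader expects every tensor name to be unique:

```python
from collections import Counter

# Reproduction sketch of the collision: after adding "model.embed_layer_norm"
# as an alias, both it and "model.norm" resolve to the same GGUF name.
mapping = {
    "model.embed_layer_norm": "output_norm",
    "model.norm": "output_norm",   # both map to the same target
}

written = [mapping[src] + ".weight" for src in mapping]
dupes = [name for name, n in Counter(written).items() if n > 1]
# dupes == ["output_norm.weight"] -- the loader then finds 291 unique names
# where the file header promised 292.
```

A real fix would presumably need either a new architecture for BlueLM or logic to drop/merge one of the colliding tensors, which is why a one-line alias isn't enough here.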

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
@KerfuffleV2 (Collaborator, Author) commented

@tastypear Sorry for the slow reply. I kept meaning to get back to this. Unfortunately, I don't really know what would be required to fix that issue, and I probably won't have the time to mess with it. Hopefully you or someone else will manage to fix it. It seems to need considerably more work than the Yi model, which was exactly the same as LLaMA except that two tensors had different names.

Anyway, not much of a response but I didn't want to just ignore you.

@tastypear commented
@KerfuffleV2 I just happened to see someone mention a related model and thought it would be easy to solve, so I took the liberty to ask. Thank you for taking the time to reply to me😉
