
Missing key "lm_head.weight" in GemmaForCausalLM when loading lora finetuned TinyLLaVA-Gemma-SigLIP-2.4B #88

Open · Yuki-Kokomi opened this issue Jun 22, 2024 · 5 comments


@Yuki-Kokomi

When attempting to merge LoRA weights into the TinyLLaVA-Gemma-SigLIP-2.4B model, I encountered a RuntimeError due to a missing key lm_head.weight in the GemmaForCausalLM state_dict. The specific error traceback is as follows:

Traceback (most recent call last):
  File "/../TinyLLaVA_Factory/merge_lora_weights.py", line 27, in <module>
    merge_lora(args)
  File "/../TinyLLaVA_Factory/merge_lora_weights.py", line 14, in merge_lora
    model, tokenizer,  image_processor, context_len = load_pretrained_model(args.model_path)
  File "/../TinyLLaVA_Factory/tinyllava/model/load_model.py", line 45, in load_pretrained_model
    model.language_model.load_state_dict(language_model_ckp)
  File "/../miniforge3/envs/tinyllava_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for GemmaForCausalLM:
        Missing key(s) in state_dict: "lm_head.weight".

This issue seems similar to vllm-project/vllm#3323. Any insights or solutions to resolve the missing "lm_head.weight" key would be greatly appreciated.
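
For reference, the missing key can be confirmed by inspecting the serialized state_dict directly. A minimal diagnostic sketch (the path is a placeholder, and the exact key names may differ if the checkpoint still carries LoRA/PEFT prefixes):

import torch

ckpt_path = '<model_path>/language_model/pytorch_model.bin'  # placeholder path
state_dict = torch.load(ckpt_path, map_location='cpu')

# Expected for a Gemma checkpoint: the input embeddings are saved,
# but the (tied) output head is not.
print('model.embed_tokens.weight' in state_dict)  # True
print('lm_head.weight' in state_dict)             # False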

@ggcr

ggcr commented Jun 22, 2024

This just happened to me as well. After the pre-training phase, I am trying to perform inference, but there seems to be a mismatch between the checkpoint saved at /<model_path>/language_model/pytorch_model.bin and the model it is loaded into.

@ggcr

ggcr commented Jun 22, 2024

A quick workaround for now is to follow the solution from vllm-project/vllm#3323, linked by @Yuki-Kokomi, and copy the embed_tokens weight onto lm_head:

language_model_ckp_path = os.path.join(model_name_or_path, 'language_model/pytorch_model.bin')
language_model_ckp = load_base_ckp_for_lora(language_model_ckp_path)

# Gemma ties lm_head to the input embeddings, so the serialized checkpoint
# omits lm_head.weight; copying embed_tokens.weight restores it.
language_model_ckp['lm_head.weight'] = language_model_ckp['model.embed_tokens.weight']

model.language_model.load_state_dict(language_model_ckp)

However, I have not been able to qualitatively validate that this fix works correctly.
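
For what it's worth, Gemma sets tie_word_embeddings=True, so lm_head shares its weight with model.embed_tokens and transformers omits lm_head.weight when serializing; copying the embedding weight should therefore reproduce exactly what tie_weights() does. An alternative sketch, assuming model.language_model is a standard transformers PreTrainedModel:

# Load non-strictly, then let transformers re-tie the output head
# to the input embeddings instead of copying the tensor by hand.
missing, unexpected = model.language_model.load_state_dict(language_model_ckp, strict=False)
assert missing == ['lm_head.weight'] and not unexpected

# Re-ties lm_head.weight to model.embed_tokens.weight, since
# config.tie_word_embeddings is True for Gemma.
model.language_model.tie_weights()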

@Yuki-Kokomi
Author

Thank you @ggcr for the solution. It works perfectly.

@ggcr

ggcr commented Jun 23, 2024

I can open a PR that applies this fix after checking whether the LLM backbone is Gemma, if the authors want.
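
A minimal sketch of what that check could look like inside load_pretrained_model (hypothetical; it assumes the backbone's config exposes tie_word_embeddings, as Gemma's does, and reuses the repo's existing load_base_ckp_for_lora helper):

language_model_ckp = load_base_ckp_for_lora(language_model_ckp_path)

# Patch the checkpoint only when the backbone ties its embeddings
# (e.g. Gemma) and the saved state_dict omitted the output head.
config = model.language_model.config
if getattr(config, 'tie_word_embeddings', False) and 'lm_head.weight' not in language_model_ckp:
    language_model_ckp['lm_head.weight'] = language_model_ckp['model.embed_tokens.weight']

model.language_model.load_state_dict(language_model_ckp)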

@YingHuTsing
Collaborator

I can open a PR that applies this fix after checking whether the LLM backbone is Gemma, if the authors want.

Hi, we do encourage you to initiate a PR!
