
[Usage] Some Weights not used, when loaded in eval mmbench. #672

Closed
shipengai opened this issue Oct 26, 2023 · 9 comments
Comments

@shipengai

Describe the issue

Issue:
When I use my second-stage trained model, the following warning appears in the log.

Command:

python -m llava.eval.model_vqa_mmbench \
    --model-path ./checkpoints/llava-v1.5-7b \
    --question-file ./playground/data/eval/mmbench/$SPLIT.tsv \
    --answers-file ./playground/data/eval/mmbench/answers/$SPLIT/llava-v1.5-7b.jsonl \
    --single-pred-prompt \
    --temperature 0 \
    --conv-mode vicuna_v1

Log:

Some weights of the model checkpoint at ./checkpoints/llava-v1.5-7b were not used when initializing LlavaLlamaForCausalLM: ['model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', …]


@shipengai shipengai changed the title from "[Usage] Error in eval." to "[Usage] Some Weights not used, when loaded in eval mmbench." on Oct 26, 2023
@haotian-liu
Owner

Is the checkpoint trained by yourself? If so, this is expected, as DeepSpeed saves the frozen vision encoder weights as well. If your results are normal, then you can safely ignore this warning.
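
If you want to confirm that the reported keys are only the frozen vision-tower tensors, a minimal sketch along these lines can list them from the checkpoint's weight index (a hypothetical check, not part of the LLaVA codebase; it assumes a sharded HF checkpoint that ships a pytorch_model.bin.index.json, so adjust the path for your setup):

import json

# Hypothetical sanity check: list which checkpoint keys belong to the
# frozen vision tower. These are the keys the "not used when initializing
# LlavaLlamaForCausalLM" warning reports.
with open("./checkpoints/llava-v1.5-7b/pytorch_model.bin.index.json") as f:
    weight_map = json.load(f)["weight_map"]

vision_keys = [k for k in weight_map if k.startswith("model.vision_tower.")]
print(f"{len(vision_keys)} of {len(weight_map)} checkpoint keys are vision-tower weights")

If every key listed in the warning starts with model.vision_tower., the warning is harmless.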

@shipengai
Author

shipengai commented Oct 26, 2023

@haotian-liu Yes, the checkpoint was trained by myself. Thanks for your reply. But I found that when using your released checkpoint, there are no such logs.

@haotian-liu
Owner

If you want to remove the vision tower weights from your checkpoint, as in the checkpoints we released, you can do this:

python -m llava.model.consolidate --src model --dst model_consolidate

The model predictions will be the same whether or not you do this.
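
For reference, here is a rough sketch of what the consolidate step amounts to (the actual implementation lives in llava/model/consolidate.py; treat this as an approximation): the checkpoint is reloaded through the transformers auto classes, which discards the unused vision-tower tensors, and then re-saved.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import llava.model  # registers LlavaLlamaForCausalLM with the auto classes

src, dst = "model", "model_consolidate"  # same paths as in the command above
model = AutoModelForCausalLM.from_pretrained(
    src, torch_dtype=torch.float16, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(src, use_fast=False)
model.save_pretrained(dst)  # re-saved without the stale vision-tower keys
tokenizer.save_pretrained(dst)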

@annopackage

Is the checkpoint trained by yourself? If so, this is expected, as DeepSpeed saves the frozen vision encoder weights as well. If your results are normal, then you can safely ignore this warning.

Hi, why is vision_tower not initialized through model.from_pretrained(model_name_or_path), since getattr('vision_tower') is true and the state_dict is in the checkpoint?

@CrossLee1

@haotian-liu As for the mmbench dataset, the ground-truth answers are provided in mmbench_dev_20230712.tsv. Why do you upload the results to the evaluation server rather than calculating accuracy offline?

@TianyunYoung

@haotian-liu As for the mmbench dataset, the ground-truth answers are provided in mmbench_dev_20230712.tsv. Why do you upload the results to the evaluation server rather than calculating accuracy offline?

@haotian-liu I have the same question, haha.

@ppalantir

@haotian-liu As for the mmbench dataset, the ground-truth answers are provided in mmbench_dev_20230712.tsv. Why do you upload the results to the evaluation server rather than calculating accuracy offline?

Hi @CrossLee1, I also found the ground-truth answers, but the accuracy I calculated is much higher than reported. Could you please give me some suggestions? Thanks.

@dacian7

dacian7 commented Aug 20, 2024

Hi, why is vision_tower not initialized through model.from_pretrained(model_name_or_path), since getattr('vision_tower') is true and the state_dict is in the checkpoint?

@annopackage Same question here... have you figured it out?

@sceliay

sceliay commented Nov 15, 2024

When I was loading the LoRA fine-tuned model, I encountered this issue as well. The message said: 'Some weights of the model checkpoint at [my lora model] were not used when initializing LlavaLlamaForCausalLM.' I also tried fine-tuned models with different numbers of iterations, but the result was the same. It seems the weights from the LoRA fine-tuning were not loaded?
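
For later readers: LLaVA LoRA checkpoints typically contain only the adapter (plus a few non-LoRA trainables), so they have to be loaded together with the base model for the LoRA weights to be merged in; loading the LoRA folder on its own can trigger exactly this "weights not used" warning. A minimal sketch using the builder API, with placeholder paths (your LoRA output directory and the base model it was trained from):

from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path

lora_path = "./checkpoints/llava-v1.5-7b-lora"  # placeholder: your LoRA output dir
base_path = "lmsys/vicuna-7b-v1.5"              # placeholder: the base model the LoRA was trained from

# The builder takes the LoRA merge path when the model name contains "lora"
# and model_base is set.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=lora_path,
    model_base=base_path,
    model_name=get_model_name_from_path(lora_path),
)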
