CLIPVisionModelWithProjection Shape Size Error #6

Open
RobeSantoro opened this issue May 13, 2024 · 2 comments

Comments

@RobeSantoro

Unfortunately, running any of the example workflows I get the following error:

Error occurred when executing AniPortrait_Pose_Gen_Video:

Error(s) in loading state_dict for CLIPVisionModelWithProjection:
size mismatch for vision_model.embeddings.class_embedding: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for vision_model.embeddings.patch_embedding.weight: copying a param with shape torch.Size([1024, 3, 14, 14]) from checkpoint, the shape in current model is torch.Size([768, 3, 32, 32]).
size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([257, 1024]) from checkpoint, the shape in current model is torch.Size([50, 768]).
size mismatch for vision_model.pre_layrnorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for vision_model.pre_layrnorm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
[...]
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

File "E:\COMFY\ComfyUI-robe\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\nodes.py", line 169, in pose_generate_video
image_enc = CLIPVisionModelWithProjection.from_pretrained(image_encoder_path).to(dtype=weight_dtype, device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\.conda\envs\comfy\Lib\site-packages\transformers\modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\.conda\envs\comfy\Lib\site-packages\transformers\modeling_utils.py", line 4155, in _load_pretrained_model
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")

Is this related to my torch or transformers version?
I'm running transformers 4.40.2 and torch 2.3.0+cu118

Can you please help me to fix it?
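
For reading the size mismatches above: the embedding shapes of a CLIP vision model follow from three config values (hidden size, patch size, image size). The sketch below is not from the thread; `clip_vision_embedding_shapes` is a hypothetical helper, and the 224-pixel image size is an assumption. It reproduces both sets of shapes in the error: the checkpoint looks like a 1024-wide, patch-14 model, while the model being built is 768-wide with patch size 32. That points at the config used to build the model rather than at the torch or transformers version, though the thread does not confirm this.

```python
def clip_vision_embedding_shapes(hidden_size, patch_size, image_size=224, num_channels=3):
    """Return the expected shapes of the embedding parameters named in the error.

    image_size=224 is an assumption; it is not stated anywhere in the thread.
    """
    # One position per image patch, plus one for the class token.
    num_patches = (image_size // patch_size) ** 2
    return {
        "class_embedding": (hidden_size,),
        "patch_embedding.weight": (hidden_size, num_channels, patch_size, patch_size),
        "position_embedding.weight": (num_patches + 1, hidden_size),
        "pre_layrnorm.weight": (hidden_size,),
    }


# Values read off the error message: checkpoint side vs. "current model" side.
checkpoint = clip_vision_embedding_shapes(hidden_size=1024, patch_size=14)
current = clip_vision_embedding_shapes(hidden_size=768, patch_size=32)

for name in checkpoint:
    print(f"{name}: checkpoint {checkpoint[name]} vs current {current[name]}")
```

The 768 / 32 combination also matches the default values of `CLIPVisionConfig` in transformers, so one possibility is that the config next to the weights is missing fields or is not the one that belongs to the checkpoint.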

@RobeSantoro
Author

I can also add the output before the error:

got prompt
[rgthree] Using rgthree's optimized recursive execution.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
 ['conv_norm_out.weight, conv_norm_out.bias, conv_out.weight, conv_out.bias']
loaded temporal unet's pretrained weights from E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\pretrained_model\stable-diffusion-v1-5\unet ...
Load motion module params from E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\pretrained_model\motion_module.pth
Loaded 453.20928M-parameter motion module

I'm also running diffusers 0.26.2.

@frankchieng
Owner

Make sure the files under pretrained_model are complete; otherwise you'd better re-download the models. Also, pay attention to the size of the reference image and video: both should be square.
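
One way to check whether the image encoder folder is complete is to look at its config and weight files directly. This is a minimal sketch, not part of the node's code: `inspect_image_encoder` is a hypothetical helper, the file names `config.json`, `pytorch_model.bin` and `model.safetensors` are the usual Hugging Face names and are assumed here, and the folder path is a placeholder because the thread does not show the value of `image_encoder_path`.

```python
import json
from pathlib import Path


def inspect_image_encoder(folder):
    """Report the vision config values and weight file sizes found in a folder."""
    folder = Path(folder)
    report = {"config": None, "weights": {}}

    config_path = folder / "config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # A full CLIP config nests the vision settings under "vision_config";
        # a vision-only config keeps them at the top level.
        vision = config.get("vision_config", config)
        report["config"] = {
            key: vision.get(key) for key in ("hidden_size", "patch_size", "image_size")
        }

    # Assumed weight file names; a truncated download shows up as a small size.
    for name in ("pytorch_model.bin", "model.safetensors"):
        weight_path = folder / name
        if weight_path.is_file():
            report["weights"][name] = weight_path.stat().st_size

    return report


# Placeholder path: replace it with the folder passed as image_encoder_path.
print(inspect_image_encoder("path/to/image_encoder"))
```

If `config` comes back as `None`, or `hidden_size` and `patch_size` are missing or do not match the 1024 and 14 seen on the checkpoint side of the error, re-downloading that folder is a reasonable next step.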
