CLIPVisionModelWithProjection Shape Size Error #6

RobeSantoro · 2024-05-13T15:56:46Z

Unfortunately, when I run any of the example workflows, I get the following error:

Error occurred when executing AniPortrait_Pose_Gen_Video:

Error(s) in loading state_dict for CLIPVisionModelWithProjection:
size mismatch for vision_model.embeddings.class_embedding: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for vision_model.embeddings.patch_embedding.weight: copying a param with shape torch.Size([1024, 3, 14, 14]) from checkpoint, the shape in current model is torch.Size([768, 3, 32, 32]).
size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([257, 1024]) from checkpoint, the shape in current model is torch.Size([50, 768]).
size mismatch for vision_model.pre_layrnorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for vision_model.pre_layrnorm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).

[...]

You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

File "E:\COMFY\ComfyUI-robe\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\nodes.py", line 169, in pose_generate_video
image_enc = CLIPVisionModelWithProjection.from_pretrained(image_encoder_path).to(dtype=weight_dtype, device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\.conda\envs\comfy\Lib\site-packages\transformers\modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\.conda\envs\comfy\Lib\site-packages\transformers\modeling_utils.py", line 4155, in _load_pretrained_model
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")

Is this related to my torch or transformers version?
I'm running transformers 4.40.2 and torch 2.3.0+cu118

Can you please help me to fix it?


RobeSantoro · 2024-05-13T16:19:48Z

I can also add the output before the error:

got prompt
[rgthree] Using rgthree's optimized recursive execution.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
 ['conv_norm_out.weight, conv_norm_out.bias, conv_out.weight, conv_out.bias']
loaded temporal unet's pretrained weights from E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\pretrained_model\stable-diffusion-v1-5\unet ...
Load motion module params from E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\pretrained_model\motion_module.pth
Loaded 453.20928M-parameter motion module

I'm also running diffusers 0.26.2.

frankchieng · 2024-05-14T02:52:23Z

Make sure the files in pretrained_model are complete; otherwise you'd better re-download the models. By the way, pay attention to the reference image and video sizes: they should all be square.
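
For anyone debugging the same message, here is a rough sketch of how the shapes in the error relate to the image encoder's config. It is not taken from ComfyUI_Aniportrait: `expected_shapes` and `read_vision_config` are hypothetical helpers, and `image_size=224` is an assumption (the thread never states it) that happens to reproduce both position counts, 257 and 50. The 768 / 32 values of the "current model" also look like the transformers defaults for a CLIP vision config, which would suggest the `config.json` next to the weights is missing fields or belongs to a different model, but that is a guess.

```python
import json
from pathlib import Path


def expected_shapes(config: dict) -> dict:
    """Embedding shapes a CLIP vision tower would build from these config values."""
    hidden = config["hidden_size"]
    patch = config["patch_size"]
    image = config["image_size"]
    # One position per image patch, plus one for the class token.
    positions = (image // patch) ** 2 + 1
    return {
        "class_embedding": (hidden,),
        "patch_embedding.weight": (hidden, 3, patch, patch),
        "position_embedding.weight": (positions, hidden),
    }


def read_vision_config(model_dir: str) -> dict:
    """Hypothetical helper: read config.json from a local image encoder folder."""
    cfg = json.loads((Path(model_dir) / "config.json").read_text(encoding="utf-8"))
    # A full CLIP config may nest the vision settings under "vision_config".
    return cfg.get("vision_config", cfg)


# Shapes the error message reports for the checkpoint.
checkpoint = {
    "class_embedding": (1024,),
    "patch_embedding.weight": (1024, 3, 14, 14),
    "position_embedding.weight": (257, 1024),
}

# Assumed: image_size=224 in both cases (not stated in the thread).
from_error_model = expected_shapes({"hidden_size": 768, "patch_size": 32, "image_size": 224})
from_matching_cfg = expected_shapes({"hidden_size": 1024, "patch_size": 14, "image_size": 224})

print("config implied by the error matches checkpoint:", from_error_model == checkpoint)
print("1024 / patch 14 config matches checkpoint:", from_matching_cfg == checkpoint)
```

To check a local install, pass the folder that `image_encoder_path` points to into `read_vision_config` and compare `expected_shapes(...)` of the result against the checkpoint shapes above. If they differ, re-downloading the image encoder folder (weights and `config.json` together) is the likely fix.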
