
The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0 #165

Open
Peter-D-James opened this issue Jul 7, 2023 · 18 comments


@Peter-D-James

When I use beam search to generate a caption for a picture on Colab, this error occurs. My transformers version is 4.25.1. It works when I use nucleus sampling. How should I solve this?

@jding25 commented Jul 9, 2023

I commented out the line that causes the error and uncommented the line below as a temporary solution.
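The mismatch itself is easy to reproduce in isolation. A minimal sketch (shapes are hypothetical, not BLIP's actual ones) of what goes wrong when the encoder states are expanded for beam search twice — once by BLIP's `generate()` and once inside recent transformers — while the attention mask is expanded only once:

```python
import torch

num_beams, batch = 3, 1
image_embeds = torch.randn(batch, 197, 768)      # hypothetical ViT output
image_atts = torch.ones(image_embeds.shape[:2])  # mask matching the embeds

# BLIP's generate() pre-expands the embeds for beam search...
image_embeds = image_embeds.repeat_interleave(num_beams, dim=0)  # batch -> 3
# ...and newer transformers expands encoder states once more internally,
# while the mask is only expanded once:
image_embeds = image_embeds.repeat_interleave(num_beams, dim=0)  # batch -> 9
image_atts = image_atts.repeat_interleave(num_beams, dim=0)      # batch -> 3

try:
    _ = image_embeds * image_atts.unsqueeze(-1)  # broadcasting 9 vs 3 fails
except RuntimeError as e:
    print("mismatch:", e)
```

The 3-vs-9 sizes in the exception match the ones reported above (batch 1 or 3, `num_beams` 3).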

@Peter-D-James (Author)

> I commented out the line that causes the error and uncommented the line below as a temporary solution.

Yeah, that's what I did. But it's strange that both methods work in the official demo Colab notebook, yet beam search fails when I write the code myself.

@Peter-D-James (Author)

> I commented out the line that causes the error and uncommented the line below as a temporary solution.

It also succeeded after downgrading transformers to 4.16.0. But it seems that I cannot import AutoProcessor when using that version of transformers.

@HWH-2000

Same issue.

@csf0429 commented Jul 23, 2023

Same issue, waiting for a fix.

@Saint-lsy

Same issue.

@bmaltais

I suggest opening an issue in kohya-ss's repo, as he maintains the underlying scripts; I only wrap them in a GUI: https://github.com/kohya-ss/sd-scripts

@shams2023

> When I use beam search to generate a caption for a picture on Colab, this error occurs. My transformers version is 4.25.1. It works when I use nucleus sampling. How should I solve this?

Dude, I also ran into this problem. I changed sample=False to sample=True, which allows successful execution. But I wonder why beam search doesn't work and only nucleus sampling can be used.


@hannahgym commented Nov 20, 2023

I have the same problem, and using transformers==4.16 did not help.

@yenlianglai commented Dec 3, 2023

I found that commenting out line 131 in models/blip.py (the `image_embeds = image_embeds.repeat_interleave(num_beams, dim=0)` expansion) fixes the problem.

I don't know why; hopefully someone can provide a detailed explanation of what happens under the hood.

@LWShowTime

> I found that commenting out line 131 in models/blip.py fixes the problem. I don't know why; hopefully someone can provide a detailed explanation of what happens under the hood.

When you comment out this line, dimension 9 becomes 3, so it can run. But this is not beam search anymore, since you just keep one result!

@mandalinadagi

To solve this problem, you need to set num_beams=1, not 3 (for instance in blip_vqa.py, line 92).
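For what it's worth, `num_beams=1` works because the pre-expansion in blip.py becomes a no-op, so the two expansions can no longer disagree — though generation then degenerates to greedy decoding rather than a real beam search. A quick check of the no-op:

```python
import torch

image_embeds = torch.arange(12.0).reshape(3, 2, 2)  # hypothetical batch of 3

# With num_beams=1, repeat_interleave leaves the batch dimension unchanged:
expanded = image_embeds.repeat_interleave(1, dim=0)
assert torch.equal(expanded, image_embeds)

# With num_beams=3, the batch dimension triples -- the mismatch source:
assert image_embeds.repeat_interleave(3, dim=0).shape[0] == 9
```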

@LWShowTime

I solved this problem. If transformers is 4.16.0, everything is OK. But I used transformers 4.36.2; in that case, at line 818 of transformers' generation_utils.py, you need to comment out the `_expand_dict_for_generation` call, where encoder_hidden_states is multiplied by the beam number again. Finally solved!

@bmaltais commented Mar 5, 2024

You should submit a PR to kohya-ss's sd-scripts repo to fix it for good.

@amztc34283

Updating to 1.0.2 fixed it for me.

@kohya-ss commented Mar 30, 2024

Commenting out these two lines may work:

BLIP/models/blip.py, lines 131 to 132 (at commit 3a29b74):

```python
if not sample:
    image_embeds = image_embeds.repeat_interleave(num_beams, dim=0)
```

EDIT: After commenting I noticed yenlianglai had already written this.

Recent transformers seems to do the repeat_interleave automatically in `_expand_dict_for_generation`. The fix huggingface/transformers#21624 seems to have caused this issue.
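Given that explanation, a more portable patch than deleting the lines outright (or editing the installed transformers source, which is lost on upgrade) would be to pre-expand only on transformers releases that do not expand encoder states themselves. A sketch of such a guard for blip.py; the cutoff version here is a hypothetical assumption, so verify it against your installed release:

```python
import torch
from packaging import version


def maybe_pre_expand(image_embeds, num_beams, transformers_version,
                     cutoff="4.26.0"):
    """Pre-expand encoder states for beam search only on transformers
    releases that do not expand them inside generate().

    The default cutoff is an assumption, not a verified version number.
    """
    if version.parse(transformers_version) < version.parse(cutoff):
        return image_embeds.repeat_interleave(num_beams, dim=0)
    return image_embeds
```

Usage inside BLIP's `generate()` would then be something like `image_embeds = maybe_pre_expand(image_embeds, num_beams, transformers.__version__)`, keeping the change inside blip.py instead of the library.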

@Ashh-Z commented May 23, 2024

I used transformers==4.17 and did not face further issues.
