
Weird caption for a picture of flower #99

Open
phelogges opened this issue Sep 22, 2022 · 5 comments

Comments

@phelogges

phelogges commented Sep 22, 2022

I got a weird caption for a picture of a flower and don't know why :(
Any advice would be appreciated.

Model: model_base_capfilt_large.pth
sha256: 8f5187458d4d47bb87876faf3038d5947eff17475edf52cf47b62e84da0b235f

Some core code:

import time

import torch
from PIL import Image
from torchvision import transforms
from torchvision.transforms import InterpolationMode

from models.blip import blip_decoder

device = torch.device("cpu")

image_size = 224
image_path = "xxx"  # say we read the image by path
model = blip_decoder("checkpoints/model_base_capfilt_large.pth", image_size=image_size, vit="base")
model.eval()
model = model.to(device)

raw_image = Image.open(image_path).convert("RGB")
transform = transforms.Compose([
    transforms.Resize((image_size, image_size), interpolation=InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711))
])
image = transform(raw_image).unsqueeze(0).to(device)

with torch.no_grad():
    # beam search
    t0 = time.time()
    caption = model.generate(image, sample=False, num_beams=3, max_length=20, min_length=5)[0]
    cost = time.time() - t0
print(caption)

Output: dai dai dai dai dai dai dai dai dai dai dai dai dai dai dai dai

Here's the input image:

[image attachment]

@phelogges
Author

Update
Same code and model; I tested two other pictures of flowers, similar to the picture above.

Pic1: a bunch of dai dai dai dai dai dai dai dai dai dai dai dai dai

Pic2: a bunch of yellow flowers

Judging from Pic1's caption, the model seems to think "dai" is a kind of flower, and these flowers are indeed called daisies :)

@LiJunnan1992
Contributor

Thanks for posting this interesting behavior from the model; this is new to me :)

@phelogges
Author

So, any advice for improvement?
Maybe tuning the parameters in caption = model.generate(image, sample=False, num_beams=3, max_length=20, min_length=5)[0] would help?
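One parameter worth a look: BLIP's generate() also accepts a repetition_penalty argument (it appears to default to 1.0, i.e. disabled), which penalizes re-emitting tokens that have already appeared. A toy sketch of the mechanism, with a made-up three-token vocabulary and logits purely for illustration:

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty):
    """Scale down logits of tokens already generated (CTRL-style rule:
    positive logits are divided by the penalty, negative ones multiplied)."""
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Toy vocabulary: token 0 = "dai", token 1 = "flowers", token 2 = "yellow"
logits = [3.0, 2.5, 1.0]
history = [0, 0, 0]  # "dai" was already emitted three times

before = softmax(logits)
after = softmax(apply_repetition_penalty(logits, history, penalty=1.5))
print(f"P(dai) before: {before[0]:.2f}, after: {after[0]:.2f}")
```

With the penalty applied, the repeated token's probability drops, so the decoder is pushed toward other continuations instead of looping on "dai".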

@LiJunnan1992
Contributor

You may want to try the image captioning model finetuned on COCO.

@saffie91

Nucleus sampling doesn't show this behavior either. The way I see it, beam search tries to fill the minimum length but gets stuck repeating the same token when the picture is simple and there is not much else to say.
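For reference, nucleus (top-p) sampling keeps only the smallest set of tokens whose cumulative probability reaches p, renormalizes, and samples from that set, so the decoder doesn't deterministically lock onto one continuation the way beam search can. In BLIP this would correspond to something like model.generate(image, sample=True, top_p=0.9, max_length=20, min_length=5). A minimal sketch of the filtering step with a made-up next-token distribution:

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability >= top_p,
    then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy next-token distribution: token 0 = "dai", 1 = "flowers", 2 = "yellow", 3 = "a"
probs = [0.5, 0.3, 0.15, 0.05]
filtered = nucleus_filter(probs, top_p=0.9)
print(sorted(filtered))  # token 3 falls outside the nucleus
```

The low-probability tail is cut off, and sampling over the renormalized remainder still leaves several plausible continuations in play rather than one dominant beam.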
