Batch predictions Image Captioning task #58

MikeMACintosh · 2022-05-18T08:45:56Z

Hi, glad to see and use this cool project, thank you.
I have a question: is it possible to batch predictions for the image captioning task?
I saw #48, but it's not my case.

I do something like:

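import torch
from models.blip import blip_decoder  # assumes the script runs from the BLIP repo root

# Load the base captioning model and switch it to inference mode
base_model_path = 'path_to_base_model'
model_base = blip_decoder(pretrained=base_model_path, vit='base', image_size=IMAGE_SIZE)
model_base.eval()
model_base.to(device)

# Preprocess a single image and add a batch dimension of 1
img = transform(sample).unsqueeze(0).to(device)
with torch.no_grad():
    # sample=False selects beam search decoding
    caption_bs_base = model_base.generate(img, sample=False, num_beams=7, max_length=16, min_length=5)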

It works well, but I want to run inference with 4 models (ViT base/large and beam search/nucleus sampling), and it takes too long. On my server, captioning 12 pictures with 4 models takes ~34 sec (12*4 = 48 captions).

Thank you.

LiJunnan1992 · 2022-05-23T02:17:58Z

Yes, you can do batch inference.
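
For readers wondering what that looks like in practice, here is a minimal sketch, not the maintainers' code. It assumes `generate` accepts a tensor of shape `[B, 3, H, W]` and returns one caption per image. `caption_batch` and `StubCaptioner` are hypothetical names introduced here; the stub only stands in for a loaded BLIP decoder so the snippet runs without a checkpoint.

```python
import torch


def caption_batch(model, image_tensors, device="cpu", **generate_kwargs):
    """Caption several preprocessed images with a single generate() call."""
    # Stack B tensors of shape [3, H, W] into one [B, 3, H, W] batch.
    # All images must already be resized to the same H and W.
    batch = torch.stack(image_tensors).to(device)
    with torch.no_grad():
        # Assumption: generate() returns a list with one caption per batch row.
        return model.generate(batch, **generate_kwargs)


class StubCaptioner:
    """Hypothetical stand-in for a BLIP decoder; replace with the real model."""

    def generate(self, images, **kwargs):
        return [f"caption {i}" for i in range(images.size(0))]


# Random tensors stand in for the output of transform(sample).
images = [torch.rand(3, 384, 384) for _ in range(12)]
captions = caption_batch(
    StubCaptioner(), images,
    sample=False, num_beams=7, max_length=16, min_length=5,
)
```

With a real model, the main change from the single-image snippet is replacing `unsqueeze(0)` with `torch.stack` over several transformed images. Beam search tends to increase memory use with the number of beams, so the batch size may need to stay small on a limited GPU.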

MikeMACintosh · 2022-05-23T08:51:14Z

@LiJunnan1992 Could you explain how I can do that? Should I write my own DataLoader?

poipiii · 2022-05-26T05:45:34Z

Yes, you have to write your own data loader. I just did it myself.
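
The commenter's loader was not posted, so the following is only a sketch of one way to do it with `torch.utils.data`. `ImageListDataset`, `caption_dataset`, and `StubCaptioner` are hypothetical names. The stub replaces the real BLIP decoder, and random tensors replace decoded images, so the example runs on its own.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ImageListDataset(Dataset):
    """Hypothetical dataset: a list of images plus a preprocessing transform."""

    def __init__(self, images, transform):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # Must return a tensor of shape [3, H, W] with a fixed H and W.
        return self.transform(self.images[index])


def caption_dataset(model, dataset, batch_size=4, device="cpu", **generate_kwargs):
    """Run generate() once per batch and collect the captions in input order."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    captions = []
    with torch.no_grad():
        for batch in loader:
            # The default collate function stacks items into [B, 3, H, W].
            captions.extend(model.generate(batch.to(device), **generate_kwargs))
    return captions


class StubCaptioner:
    """Hypothetical stand-in for a BLIP decoder; replace with the real model."""

    def generate(self, images, **kwargs):
        return ["stub caption"] * images.size(0)


# Random tensors stand in for images; the identity lambda stands in for
# the real preprocessing transform.
fake_images = [torch.rand(3, 32, 32) for _ in range(10)]
dataset = ImageListDataset(fake_images, transform=lambda x: x)
captions = caption_dataset(
    StubCaptioner(), dataset, batch_size=4, sample=False, num_beams=7
)
```

Keeping `shuffle=False` means the returned captions line up with the input order, so they can be zipped back to file names. The last batch may be smaller than `batch_size` (here 4, 4, and 2 images), which `generate` handles as long as it reads the batch size from the tensor.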
