add support for num_return_sequences to BLIP_Decoder.generate ? #64

labenz · 2022-06-12T17:32:46Z

Congrats on creating such an effective image captioner! I'm using it in a project, and noticing that it's popping up in quite a few other places as well.

Inspired in part by this paper – https://arxiv.org/pdf/2205.10747.pdf – I was wondering if it would be possible to generate multiple candidate captions instead of just one?

It looks like it is possible, at least using the nucleus sampling method, but the value of num_return_sequences is hard-coded to 1 in the BLIP_Decoder generate method, see here: https://github.com/salesforce/BLIP/blob/main/models/blip.py#L149

Would it be possible to add num_return_sequences to the generate method arguments and thus get multiple candidate captions?

Thank you!

LiJunnan1992 · 2022-06-13T09:19:51Z

Thanks for the suggestion. It is definitely possible to generate multiple sentences. We do not plan to change the code for now, in case it breaks other code, but feel free to clone the repository and change num_return_sequences yourself.
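One practical detail when raising num_return_sequences in a local clone: the Hugging Face generate function returns one flat batch of batch_size * num_return_sequences sequences, with the candidates for each input stored next to each other. The sketch below regroups already-decoded captions into one list per image. The helper name `group_captions` and the example captions are hypothetical and not part of BLIP.

```python
from typing import List, Sequence


def group_captions(captions: Sequence[str], num_return_sequences: int) -> List[List[str]]:
    """Split a flat list of decoded captions into one list per input image.

    Assumes the flat list is ordered image by image, i.e. the first
    `num_return_sequences` captions belong to the first image, and so on.
    """
    if num_return_sequences < 1:
        raise ValueError("num_return_sequences must be at least 1")
    if len(captions) % num_return_sequences != 0:
        raise ValueError("caption count is not a multiple of num_return_sequences")
    return [
        list(captions[start:start + num_return_sequences])
        for start in range(0, len(captions), num_return_sequences)
    ]


# Hypothetical example: two images, three candidate captions each.
flat = [
    "a dog on a beach", "a dog running on sand", "a puppy by the sea",
    "a red car", "a car parked on a street", "a small red vehicle",
]
per_image = group_captions(flat, num_return_sequences=3)
```

Here `per_image[0]` holds the three candidates for the first image and `per_image[1]` those for the second.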

labenz · 2022-06-13T16:36:15Z

Thank you! It seems this only works for the nucleus sampling method – is that right? Thanks again. :)

LiJunnan1992 · 2022-06-14T01:02:29Z

You can also return multiple sentences with beam search. Please refer to the description of the generate function here: https://huggingface.co/docs/transformers/main/en/main_classes/text_generation
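As a rough sketch of what the two modes need, the helper below builds the keyword arguments one might forward to the Hugging Face generate call. The parameter names `do_sample`, `top_p`, `num_beams`, and `num_return_sequences` are taken from the Hugging Face generate documentation; the helper name `candidate_generate_kwargs` and its default values are illustrative assumptions, not BLIP's actual code. With beam search, num_return_sequences cannot exceed num_beams, so the helper checks that up front.

```python
from typing import Any, Dict


def candidate_generate_kwargs(
    sample: bool,
    num_return_sequences: int = 3,
    num_beams: int = 3,
    top_p: float = 0.9,
) -> Dict[str, Any]:
    """Build keyword arguments for returning several candidates per input.

    Hypothetical helper: the returned dict is meant to be unpacked into a
    Hugging Face `generate(...)` call alongside the usual inputs.
    """
    if num_return_sequences < 1:
        raise ValueError("num_return_sequences must be at least 1")

    kwargs: Dict[str, Any] = {"num_return_sequences": num_return_sequences}
    if sample:
        # Nucleus sampling: each returned sequence is an independent sample.
        kwargs.update(do_sample=True, top_p=top_p)
    else:
        # Beam search: only the beams that were kept can be returned.
        if num_return_sequences > num_beams:
            raise ValueError("num_return_sequences must be <= num_beams for beam search")
        kwargs.update(do_sample=False, num_beams=num_beams)
    return kwargs


sampling_kwargs = candidate_generate_kwargs(sample=True, num_return_sequences=5)
beam_kwargs = candidate_generate_kwargs(sample=False, num_return_sequences=3, num_beams=5)
```

Beam search candidates tend to be near-duplicates of one another, while nucleus sampling usually gives more varied captions, so the right mode depends on what the candidates are used for.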
