
Why is there no model checkpoint that performs ITC+ITM+LM loss on COCO/Flickr? #132

Closed
linzhiqiu opened this issue Feb 28, 2023 · 4 comments

Comments

@linzhiqiu

I am curious why BLIP does not apply all 3 losses when finetuning on COCO/Flickr. I would have thought that using all 3 losses would produce a model that can simultaneously perform both retrieval and captioning (on COCO/Flickr). Let me know if this is a misunderstanding!
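For context, the multi-task setup being asked about is just a weighted sum of the three objectives (image-text contrastive, image-text matching, and language modeling), which BLIP uses in pre-training. A minimal sketch, assuming equal weights of 1.0 (the actual weighting and loss values below are illustrative, not from the BLIP repo):

```python
def total_loss(loss_itc, loss_itm, loss_lm,
               w_itc=1.0, w_itm=1.0, w_lm=1.0):
    """Combine the three BLIP pre-training objectives into one
    scalar for backprop. In the real codebase each term would be
    a per-batch loss from the corresponding model head; here they
    are plain floats for illustration."""
    return w_itc * loss_itc + w_itm * loss_itm + w_lm * loss_lm

# Made-up per-batch loss values:
print(total_loss(0.42, 0.31, 2.05))  # 2.78
```

Fine-tuning with this combined objective (rather than LM loss alone) is what the question proposes for COCO/Flickr.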

@Pppapaya

> I am curious why BLIP does not apply all 3 losses when finetuning on COCO/Flickr. I would have thought that using all 3 losses would produce a model that can simultaneously perform both retrieval and captioning (on COCO/Flickr). Let me know if this is a misunderstanding!

Hi, I'm curious about this too. Have you found the reason yet?

@linzhiqiu
Author

I think BLIP-2 already answers this question: BLIP-2 was trained with all 3 losses and achieved improved performance.

@gwyong

gwyong commented May 26, 2023

What does that mean? I couldn't understand. Does it mean BLIP was trained with the three losses separately, or jointly? I think it only used the LM loss for fine-tuning on image captioning.

@BrianG13

BrianG13 commented Aug 7, 2023

@linzhiqiu any progress on this?
