Why is there no model checkpoint that performs ITC+ITM+LM loss on COCO/Flickr? #132
Comments
Hi, I'm curious about this too. Do you know the reason now?
I think BLIP-2 answers this question already: BLIP-2 was trained with 3 losses and got improved performance.
What does that mean? I couldn't understand it. Does it mean BLIP was trained with the three losses separately, or with all of them together? I think it only used the LM loss for fine-tuning on image captioning.
@linzhiqiu any progress on this?
I am curious why BLIP does not apply all 3 losses when fine-tuning on COCO/Flickr. I would have thought that using all 3 losses would produce a model that can perform both retrieval and captioning (on COCO/Flickr) simultaneously. Let me know if this is a misunderstanding!
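To make the question concrete, here is a rough sketch of what "using all 3 losses" could look like as a single objective. This is not BLIP's actual training code: the function names (`itc_loss`, `itm_loss`, `lm_loss`, `joint_loss`), the equal default weights, and the plain-Python list inputs are all assumptions made for illustration, and details such as the temperature and negative sampling are left out.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one row of logits against an integer class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def itc_loss(sim):
    """Symmetric contrastive loss over an N x N image-text similarity
    matrix, with matched pairs assumed to lie on the diagonal."""
    n = len(sim)
    i2t = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    t2i = sum(cross_entropy([sim[j][i] for j in range(n)], i)
              for i in range(n)) / n
    return 0.5 * (i2t + t2i)

def itm_loss(pair_logits, labels):
    """Binary matching loss; each row holds [no_match, match] logits."""
    return sum(cross_entropy(row, y)
               for row, y in zip(pair_logits, labels)) / len(labels)

def lm_loss(token_logits, token_ids):
    """Next-token cross-entropy averaged over caption positions."""
    return sum(cross_entropy(row, t)
               for row, t in zip(token_logits, token_ids)) / len(token_ids)

def joint_loss(sim, pair_logits, labels, token_logits, token_ids,
               weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three objectives (equal weights are an assumption)."""
    w_itc, w_itm, w_lm = weights
    return (w_itc * itc_loss(sim)
            + w_itm * itm_loss(pair_logits, labels)
            + w_lm * lm_loss(token_logits, token_ids))
```

Setting `weights=(0.0, 0.0, 1.0)` reduces this to an LM-only (captioning-style) objective, so the question is roughly whether keeping the other two weights non-zero during fine-tuning would help or hurt.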