Fine tune BLIP image retrieval for custom dataset without annotations #55
Hi, I would like to ask how I should approach fine-tuning BLIP for image retrieval. My dataset contains caption and image pairs with no bounding box annotations. Is it possible to train BLIP without annotations, or should I create a bounding box with width/height equal to the image width/height for each image?

Comments

Hi, BLIP does not require bounding box input. You can try to use the entire image as input.

Can you describe how that would work and how I should define the dataset for BLIP image retrieval fine-tuning?

You can define the dataset following the same format as COCO.

Oh, I get it: so I define my dataset in a JSON file using the same format as the coco_karpathy dataset.
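For reference, the coco_karpathy-style annotation file discussed above can be sketched as a JSON list with one entry per image–caption pair. The field names (`image`, `caption`, `image_id`) follow the layout of `coco_karpathy_train.json` in the BLIP repository; the file name, image paths, and ids below are hypothetical placeholders for a custom dataset.

```python
import json

# Hypothetical custom dataset entries; "image" is a path relative to your
# image root, "caption" is a single caption string, and "image_id" is any
# unique identifier string. An image with several captions gets one entry
# per caption, all sharing the same image_id.
annotations = [
    {
        "image": "images/dog_001.jpg",
        "caption": "a brown dog running on grass",
        "image_id": "custom_0",
    },
    {
        "image": "images/cat_002.jpg",
        "caption": "a cat sleeping on a windowsill",
        "image_id": "custom_1",
    },
]

# Write the annotation file the training script would load.
with open("custom_train.json", "w") as f:
    json.dump(annotations, f)

# Sanity check: reload and inspect the structure.
with open("custom_train.json") as f:
    data = json.load(f)
print(len(data), data[0]["image_id"])
```

Since no bounding boxes appear anywhere in the entries, this matches the maintainer's point that the entire image is used as input.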