
Fine tune BLIP image retrieval for custom dataset without annotations #55

poipiii opened this issue May 10, 2022 · 4 comments

poipiii commented May 10, 2022

Hi, I would like to ask how I should approach fine-tuning BLIP for image retrieval. My dataset contains caption and image pairs with no bounding-box annotations. Is it possible to train BLIP without annotations, or should I create a bounding box of width/height equal to the image width/height for each image?

@LiJunnan1992
Contributor

Hi, BLIP does not require bounding box input. You can try to use the entire image as input.


poipiii commented May 10, 2022

Can you describe how that would work, and how I should define the dataset for BLIP image-retrieval fine-tuning?

@LiJunnan1992

You can define the dataset following the same format as COCO.


poipiii commented May 11, 2022

Oh, I get it: so I define my dataset in a JSON file using the same format as the coco_karpathy dataset, like this:

{
  "caption": "example caption for image",
  "image": "001.png",
  "image_id": "001"
}
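For reference, a minimal sketch of generating such an annotation file from a custom dataset might look like the following. The `pairs` list, output filename, and the rule for deriving `image_id` from the filename are all assumptions for illustration, not part of the BLIP codebase; the coco_karpathy loaders expect a JSON list of per-pair entries, so the sketch writes one dict per caption and image pair.

```python
import json

# Hypothetical caption/image pairs from a custom dataset (no bounding boxes needed).
pairs = [
    ("001.png", "example caption for image"),
    ("002.png", "another caption"),
]

# Build a coco_karpathy-style list of annotation dicts.
# Here image_id is derived from the filename; any unique string would do.
annotations = [
    {"caption": caption, "image": image, "image_id": image.rsplit(".", 1)[0]}
    for image, caption in pairs
]

# Write the annotation file that the dataset config would point at
# (the filename here is an assumption).
with open("custom_retrieval_train.json", "w") as f:
    json.dump(annotations, f, indent=2)
```

Each image is then used in full at training time, consistent with the reply above that no bounding-box input is required.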
