This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Caption-Based Image Retrieval Model #5019

Open
dirkgr opened this issue Feb 24, 2021 · 0 comments
Labels
Contributions welcome hard Difficult tasks Models Issues related to the allennlp-models repo

Comments

@dirkgr
Member

dirkgr commented Feb 24, 2021

We want to implement the Caption-Based Image Retrieval task from https://api.semanticscholar.org/CorpusID:199453025.

The COCO and Flickr30k datasets contain a large number of captioned images. The task here is to train a model that picks the right image for a given caption. The model chooses among four candidate images: the one the caption actually describes, and three others picked at random from the dataset.
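As a rough illustration of the four-way setup, here is a minimal sketch of how one example could be assembled. The function name, the dictionary layout, and the use of plain string image ids are assumptions made for this sketch, not anything AllenNLP defines; negatives are drawn uniformly at random, which is one reading of "random images from the dataset".

```python
import random
from typing import Dict, Sequence


def make_retrieval_instance(
    caption: str,
    true_image_id: str,
    all_image_ids: Sequence[str],
    rng: random.Random,
    num_candidates: int = 4,
) -> Dict[str, object]:
    """Build one caption-with-candidates example (hypothetical layout)."""
    # Every image except the true one is a possible negative.
    # Sorting keeps the result reproducible for a given seed.
    pool = sorted(set(all_image_ids) - {true_image_id})
    if len(pool) < num_candidates - 1:
        raise ValueError("not enough images to sample negatives from")

    # Sample distinct negatives, then shuffle so the true image
    # does not always sit in the same position.
    candidates = rng.sample(pool, num_candidates - 1) + [true_image_id]
    rng.shuffle(candidates)

    return {
        "caption": caption,
        "image_ids": candidates,
        # Index of the true image among the candidates.
        "label": candidates.index(true_image_id),
    }


if __name__ == "__main__":
    rng = random.Random(13)
    image_ids = [f"img{i}" for i in range(10)]
    print(make_retrieval_instance("a dog on a beach", "img3", image_ids, rng))
```

Passing an explicit `random.Random` keeps the negatives reproducible across runs, which matters if the validation and test splits are supposed to stay fixed.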

You will have to write Steps that produce a DatasetDict for Flickr30k and COCO, including code that can produce the negative examples. Each instance will consist of a caption with four images. You will also need to write a model that can solve this task. The underlying component for the model will be VilBERT, and the VQA model is probably a good place to steal some code to get started.
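The issue does not say how the model should combine the four images. One plausible framing, assumed here, is to have the backbone produce one alignment score per caption-image pair and treat the result as a four-way classification. The sketch below shows only that last step in plain Python; the scores are placeholders for whatever the VilBERT-based model outputs, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieval_loss_and_prediction(
    scores: Sequence[float], label: int
) -> Tuple[float, int]:
    """Cross-entropy loss and argmax prediction for one instance.

    `scores` holds one caption-image alignment score per candidate
    (assumed to come from the model); `label` is the index of the
    true image.
    """
    probs = softmax(scores)
    loss = -math.log(probs[label])
    prediction = max(range(len(scores)), key=lambda i: scores[i])
    return loss, prediction


if __name__ == "__main__":
    # Placeholder scores for four candidate images.
    loss, prediction = retrieval_loss_and_prediction([1.0, 3.0, 0.5, -1.0], label=1)
    print(f"loss={loss:.4f} prediction={prediction}")
```

With this framing, accuracy is simply the fraction of instances where the prediction equals the label, and a model that scores all four candidates equally gets a loss of log 4.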

@dirkgr dirkgr added Contributions welcome Models Issues related to the allennlp-models repo GSoC hard Difficult tasks labels Feb 24, 2021
@dirkgr dirkgr added this to Not Started in Google Summer of Code Feb 26, 2021
@dirkgr dirkgr removed the GSoC label Mar 9, 2021