v1.0
Readme
- Images for the training set are from COCO train2014 and val2014, available here.
- The VisDial evaluation server is hosted on EvalAI.
- Numbers (in papers, etc.) should be reported on v1.0 test-std, and NOT on v0.9.
- The Visual Dialog Challenge is conducted on v1.0 test-challenge.
- For both test-std and test-challenge, predictions must be submitted on the full test set.
- [NEW] Relevance scores from dense answer annotations on v1.0 val can be used to compute NDCG. Read more here.
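As a rough illustration of the NDCG note above, here is a minimal sketch of the standard NDCG formula applied to one round. The function name `ndcg_at_k` and the default cutoff (the number of candidates with non-zero relevance) are assumptions made for this example, not something this page specifies; check the evaluation server's documentation before relying on it for reported numbers.

```python
import math


def ndcg_at_k(ranked_relevance, k=None):
    """NDCG for relevance scores listed in the model's ranked order (best first)."""
    if k is None:
        # Assumption: cut off at the number of candidates with non-zero relevance.
        k = sum(1 for r in ranked_relevance if r > 0)

    def dcg(scores):
        # Discounted cumulative gain over the top-k positions (rank i is 0-based).
        return sum(r / math.log2(i + 2) for i, r in enumerate(scores[:k]))

    ideal = dcg(sorted(ranked_relevance, reverse=True))
    return dcg(ranked_relevance) / ideal if ideal > 0 else 0.0


# Made-up relevance scores for four candidates, in ranked order.
perfect = ndcg_at_k([1.0, 0.5, 0.0, 0.0])
swapped = ndcg_at_k([0.5, 1.0, 0.0, 0.0])
```

A ranking that places the most relevant candidates first scores 1.0; swapping the top two lowers the score but keeps it above 0.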
Format
{
'data': {
'questions': [
'does it have a doorknob',
'do you see a fence around the bear',
...
],
'answers': [
'no, there is just green field in foreground',
'countryside house',
...
],
'dialogs': [
{
'image_id': <image id>,
'caption': <image caption>,
'dialog': [
{
'question': <index of question in `data.questions` list>,
'answer': <index of answer in `data.answers` list>,
'answer_options': <100 candidate answer indices from `data.answers`>,
'gt_index': <index of `answer` in `answer_options`>
},
... (10 rounds of dialog)
]
},
...
]
},
'split': <VisDial split>,
'version': '1.0'
}
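Because each round stores indices rather than text, reading a dialog means looking those indices up in `data.questions` and `data.answers`. The sketch below shows one way to do that on a toy record; the helper name `resolve_round` and the sample values are hypothetical, and the toy record uses 3 candidates instead of 100.

```python
def resolve_round(data, rnd):
    """Replace the indices in one dialog round with the strings they point to."""
    questions, answers = data["questions"], data["answers"]
    return {
        "question": questions[rnd["question"]],
        "answer": answers[rnd["answer"]],
        "answer_options": [answers[i] for i in rnd["answer_options"]],
        "gt_index": rnd["gt_index"],
    }


# Toy record in the layout described above (values are made up).
visdial = {
    "data": {
        "questions": [
            "does it have a doorknob",
            "do you see a fence around the bear",
        ],
        "answers": [
            "no, there is just green field in foreground",
            "countryside house",
            "yes",
        ],
        "dialogs": [
            {
                "image_id": 1,
                "caption": "a toy caption",
                "dialog": [
                    {
                        "question": 1,
                        "answer": 0,
                        "answer_options": [2, 0, 1],
                        "gt_index": 1,
                    }
                ],
            }
        ],
    },
    "split": "toy",
    "version": "1.0",
}

first_round = visdial["data"]["dialogs"][0]["dialog"][0]
resolved = resolve_round(visdial["data"], first_round)
```

Here `resolved["answer_options"][resolved["gt_index"]]` is the same string as `resolved["answer"]`, which is what the `gt_index` field encodes.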
v0.9
Readme
- The v0.9 training set is drawn from the COCO training set, and the v0.9 validation set is drawn from the COCO validation set.
- Numbers (in papers, etc.) should be reported on v1.0 test-std, and not on v0.9 val.
Format
{
'data': {
'questions': [
'does it have a doorknob',
'do you see a fence around the bear',
...
],
'answers': [
'no, there is just green field in foreground',
'countryside house',
...
],
'dialogs': [
{
'image_id': <COCO image id>,
'caption': <image caption from COCO>,
'dialog': [
{
'question': <index of question in `data.questions` list>,
'answer': <index of answer in `data.answers` list>,
'answer_options': <100 candidate answer indices from `data.answers`>,
'gt_index': <index of `answer` in `answer_options`>
},
... (10 rounds of dialog)
]
},
...
]
},
'split': <COCO split>,
'version': '0.9'
}
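The format above implies an invariant: in every round, the entry of `answer_options` at position `gt_index` should be the same index as `answer`. A minimal sketch of a sanity check for that invariant follows; the function name `find_gt_mismatches` and the toy data (with one deliberately broken round) are hypothetical.

```python
def find_gt_mismatches(data):
    """Return (dialog_idx, round_idx) pairs where answer_options[gt_index] != answer."""
    mismatches = []
    for d_idx, dialog in enumerate(data["dialogs"]):
        for r_idx, rnd in enumerate(dialog["dialog"]):
            if rnd["answer_options"][rnd["gt_index"]] != rnd["answer"]:
                mismatches.append((d_idx, r_idx))
    return mismatches


# Toy data: the second round is deliberately inconsistent.
toy_data = {
    "questions": ["q0", "q1"],
    "answers": ["a0", "a1", "a2"],
    "dialogs": [
        {
            "image_id": 1,
            "caption": "a toy caption",
            "dialog": [
                {"question": 0, "answer": 2, "answer_options": [0, 2, 1], "gt_index": 1},
                {"question": 1, "answer": 0, "answer_options": [1, 2, 0], "gt_index": 0},
            ],
        }
    ],
}

mismatches = find_gt_mismatches(toy_data)
```

An empty result means every round is internally consistent; on the toy data the check flags only the second round of the first dialog.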
v0.5 (deprecated)
Training set (1.2G)
50,729 images
Validation set (168M)
7,663 images
Testing set (215M)
9,628 images
Readme
- The v0.5 training and validation sets are drawn from the COCO training set, and the v0.5 testing set is drawn from the COCO validation set.
Format
[
{
'image_id': <COCO image id>,
'split': <COCO split>,
'caption': <image caption from COCO>,
'dialog': [
{
'question': '...',
'answer': '...',
'options': <100 candidate answers>,
'gt_index': <index of `answer` in `options`>
},
... (10 rounds of dialog)
]
},
...
]
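Unlike the later versions, v0.5 stores question and answer text inline in each round. To show how the two layouts relate, the sketch below converts v0.5-style records into the indexed layout used by v0.9 and v1.0. The function name `to_indexed` and the toy record are hypothetical, and keeping `split` on each dialog entry is a choice made for this example.

```python
def to_indexed(records):
    """Convert v0.5-style records (inline strings) to an indexed layout."""
    questions, answers = [], []
    q_index, a_index = {}, {}

    def intern(text, table, index):
        # Store each distinct string once and return its position in the table.
        if text not in index:
            index[text] = len(table)
            table.append(text)
        return index[text]

    dialogs = []
    for rec in records:
        rounds = []
        for rnd in rec["dialog"]:
            rounds.append({
                "question": intern(rnd["question"], questions, q_index),
                "answer": intern(rnd["answer"], answers, a_index),
                "answer_options": [intern(o, answers, a_index) for o in rnd["options"]],
                "gt_index": rnd["gt_index"],
            })
        dialogs.append({
            "image_id": rec["image_id"],
            "split": rec["split"],
            "caption": rec["caption"],
            "dialog": rounds,
        })
    return {"data": {"questions": questions, "answers": answers, "dialogs": dialogs}}


# Toy v0.5-style record (3 candidates instead of 100; values are made up).
records = [
    {
        "image_id": 1,
        "split": "toy",
        "caption": "a toy caption",
        "dialog": [
            {
                "question": "is it sunny",
                "answer": "yes",
                "options": ["no", "yes", "maybe"],
                "gt_index": 1,
            }
        ],
    }
]

indexed = to_indexed(records)
```

Since the ground-truth answer also appears among the options, both map to the same index, so `gt_index` carries over unchanged.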