Detectron2 has builtin support for a few datasets. The datasets are assumed to exist in a directory specified by the environment variable `DETECTRON2_DATASETS`. Under this directory, prepare COCO, LVIS, Cityscapes, Pascal VOC, and ADE20k.
The expected structure is described below.
```
$DETECTRON2_DATASETS/
  coco/
  lvis/
  cityscapes/
  VOC20{07,10,12}/
  ADEChallengeData2016/
```
You can set the location of the builtin datasets with `export DETECTRON2_DATASETS=/path/to/datasets`. If left unset, the default is `./datasets` relative to your current working directory.
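For reference, the lookup amounts to reading the environment variable with a fallback to the default path. A minimal sketch of that resolution logic (an illustration, not detectron2's actual code):

```python
import os

def get_dataset_root() -> str:
    """Resolve the dataset root directory.

    Falls back to "./datasets" when DETECTRON2_DATASETS is unset,
    matching the default described above.
    """
    return os.getenv("DETECTRON2_DATASETS", "./datasets")
```
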
```
$DETECTRON2_DATASETS/
  coco/
    annotations/
      instances_{train,val}2017.json
      panoptic_{train,val}2017.json
    {train,val}2017/
    panoptic_{train,val}2017/
    panoptic_stuff_{train,val}2017/
    panoptic_semseg_{train,val}2017/
  lvis/
    lvis_v1_{train,val}.json
    lvis_v1_{train,val}+coco_mask.json
    lvis_v1_{train,val}+coco_mask_cat_info.json
```
`panoptic_semseg_{train,val}2017/` are generated by running:
```
python3 datasets/prepare_coco_semantic_annos_from_panoptic_annos.py
```
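The script above collapses COCO panoptic annotations into per-pixel semantic maps. A minimal sketch of the core conversion, assuming the standard COCO panoptic encoding (segment ids packed into RGB as `R + 256*G + 256**2*B`, with `segments_info` taken from `panoptic_{split}2017.json`); the real script additionally handles file I/O and dataset-specific label mapping:

```python
import numpy as np

def panoptic_to_semantic(panoptic_rgb: np.ndarray, segments_info: list) -> np.ndarray:
    """Turn one panoptic PNG (H x W x 3, uint8) into a semantic map (H x W).

    segments_info: list of dicts, each with an "id" (the packed segment id)
    and a "category_id". Pixels belonging to no segment get the ignore
    label 255.
    """
    # Unpack the per-pixel segment id from the RGB channels.
    ids = (panoptic_rgb[..., 0].astype(np.uint32)
           + 256 * panoptic_rgb[..., 1].astype(np.uint32)
           + 256 ** 2 * panoptic_rgb[..., 2].astype(np.uint32))
    semantic = np.full(ids.shape, 255, dtype=np.uint8)  # 255 = ignore
    for seg in segments_info:
        semantic[ids == seg["id"]] = seg["category_id"]
    return semantic
```
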
`lvis_v1_{train,val}+coco_mask.json` are generated by running:
```
python3 datasets/tools/lvis/merge_lvis_coco.py
```
`lvis_v1_{train,val}+coco_mask_cat_info.json` are generated by running:
```
python3 datasets/tools/lvis/add_category_info_frequence.py --json_path datasets/lvis/lvis_v1_train+coco_mask.json
python3 datasets/tools/lvis/add_category_info_frequence.py --json_path datasets/lvis/lvis_v1_val+coco_mask.json
```
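The `*_cat_info.json` variants carry per-category frequency statistics of the kind used by frequency-aware training losses. A sketch of the core counting step, with the output field name `image_count` assumed for illustration; the fields actually written by `add_category_info_frequence.py` may differ:

```python
import json
from collections import defaultdict

def add_category_image_counts(json_path: str, out_path: str) -> None:
    """Annotate each category in a COCO-style json with the number of
    distinct images it appears in, then write the result to out_path."""
    with open(json_path) as f:
        data = json.load(f)
    # Collect the set of image ids per category id.
    images_per_cat = defaultdict(set)
    for ann in data["annotations"]:
        images_per_cat[ann["category_id"]].add(ann["image_id"])
    for cat in data["categories"]:
        cat["image_count"] = len(images_per_cat[cat["id"]])
    with open(out_path, "w") as f:
        json.dump(data, f)
```
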
## Expected dataset structure for Objects365:
```
$DETECTRON2_DATASETS/
  objects365/
    annotations/
      zhiyuan_objv2_{train,val}.json
      objects365_{train,val,minival}_fixname.json
    train/
      images/
    val/
      images/
```
`objects365_train_fixname.json` and `objects365_val_fixname.json` are generated by running:
```
python3 datasets/tools/objects3652coco/get_image_info.py --image_dir datasets/objects365/train/ --json_path datasets/objects365/annotations/zhiyuan_objv2_train.json --output_path datasets/objects365/annotations/image_info_train.txt
python3 datasets/tools/objects3652coco/get_image_info.py --image_dir datasets/objects365/val/ --json_path datasets/objects365/annotations/zhiyuan_objv2_val.json --output_path datasets/objects365/annotations/image_info_val.txt
python3 datasets/tools/objects3652coco/convert_annotations.py --root_dir datasets/objects365/ --image_info_path datasets/objects365/annotations/image_info_train.txt --subsets train --apply_exif
python3 datasets/tools/objects3652coco/convert_annotations.py --root_dir datasets/objects365/ --image_info_path datasets/objects365/annotations/image_info_val.txt --subsets val --apply_exif
python3 datasets/tools/objects3652coco/convert_annotations.py --root_dir datasets/objects365/ --image_info_path datasets/objects365/annotations/image_info_val.txt --subsets minival --apply_exif
python3 datasets/tools/objects3652coco/fix_o365_names.py --ann datasets/objects365/annotations/objects365_train.json
python3 datasets/tools/objects3652coco/fix_o365_names.py --ann datasets/objects365/annotations/objects365_val.json
python3 datasets/tools/objects3652coco/fix_o365_names.py --ann datasets/objects365/annotations/objects365_minival.json
```
As Objects365 is large, we generate an annotation file for each image separately:
```
python3 datasets/tools/generate_img_ann_pair.py --json_path datasets/objects365/annotations/objects365_train_fixname.json --image_root datasets/objects365/train/
```
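The idea behind splitting a large annotation file per image is to let the dataloader read annotations lazily instead of holding one huge json in memory. A minimal sketch of that split, with the per-image file naming (`<image_id>.json`) assumed for illustration; the real `generate_img_ann_pair.py` may use a different layout:

```python
import json
import os
from collections import defaultdict

def split_json_per_image(json_path: str, out_dir: str) -> None:
    """Write one small json per image, pairing the image record with
    its annotations from a COCO-style annotation file."""
    with open(json_path) as f:
        data = json.load(f)
    # Group annotations by the image they belong to.
    anns_by_image = defaultdict(list)
    for ann in data["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)
    os.makedirs(out_dir, exist_ok=True)
    for img in data["images"]:
        record = {"image": img, "annotations": anns_by_image[img["id"]]}
        with open(os.path.join(out_dir, f"{img['id']}.json"), "w") as f:
            json.dump(record, f)
```
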
## Expected dataset structure for OpenImages:
```
$DETECTRON2_DATASETS/
  openimages/
    annotations/
      openimages_v6_{train,val}_bbox.json
      openimages_v6_{train,val}_bbox_nogroup.json
      openimages_v6_{train,val}_bbox_cat_info.json
      openimages_v6_{train,val}_bbox_nogroup_cat_info.json
    train/
    validation/
```
`openimages_v6_{train,val}_bbox.json` are generated by running:
```
python3 datasets/tools/openimages2coco/convert_annotations.py --path datasets/openimages/ --version v6 --subset train --task bbox --apply-exif
python3 datasets/tools/openimages2coco/convert_annotations.py --path datasets/openimages/ --version v6 --subset val --task bbox --apply-exif
```
`openimages_v6_{train,val}_bbox_nogroup.json` are generated by running:
```
python3 datasets/tools/openimages2coco/convert_annotations.py --path datasets/openimages/ --version v6 --subset train --task bbox --apply-exif --exclude-group
python3 datasets/tools/openimages2coco/convert_annotations.py --path datasets/openimages/ --version v6 --subset val --task bbox --apply-exif --exclude-group
```
`*_cat_info.json` are generated by running:
```
python3 datasets/tools/lvis/add_category_info_frequence.py --json_path datasets/openimages/annotations/openimages_v6_train_bbox.json
python3 datasets/tools/lvis/add_category_info_frequence.py --json_path datasets/openimages/annotations/openimages_v6_val_bbox.json
python3 datasets/tools/lvis/add_category_info_frequence.py --json_path datasets/openimages/annotations/openimages_v6_train_bbox_nogroup.json
python3 datasets/tools/lvis/add_category_info_frequence.py --json_path datasets/openimages/annotations/openimages_v6_val_bbox_nogroup.json
```
Finally, run:
```
python3 datasets/tools/generate_img_ann_pair.py --json_path datasets/openimages/annotations/openimages_v6_train_bbox.json --image_root datasets/openimages/train/
```
## Expected dataset structure for VisualGenome:
```
$DETECTRON2_DATASETS/
  visualgenome/
    annotations/
      visualgenome_77962_box.json
      visualgenome_77962_box_{train,val}.json
      visualgenome_region.json
      visualgenome_region_{train,val}.json
      visualgenome_77962_box_and_region.json
      visualgenome_77962_box_and_region__{train,val}.json
    VG_100K/
    VG_100K_2/
```
`visualgenome_*.json` are generated by running:
```
python3 datasets/tools/visualgenome2coco/convert_annotations_object.py -p datasets/visualgenome/ --apply-exif --object_list "" --num_objects 99999999 --min_box_area_frac 0.0
python3 datasets/tools/visualgenome2coco/convert_annotations_region.py -p datasets/visualgenome/ --apply-exif --object_list "" --num_objects 99999999 --min_box_area_frac 0.0
```
## Expected dataset structure for SA-1B:
```
$DETECTRON2_DATASETS/
  SA-1B/
    images/
    sam1b_instance_1000000.json
    ...
    sam1b_instance.json
```
`sam1b_instance*.json` are generated by running:
```
python tools/sa1b2coco/image+json.py --image_root datasets/SA-1B/images/ --json_path datasets/SA-1B/sam1b_instance
```
## Expected dataset structure for RefCOCO:
```
$DETECTRON2_DATASETS/
  SeqTR/
    mixed/
    refcocog-google/
      instances_cocofied_{train,val}.json
    refcocog-umd/
      instances_cocofied_{train,val,test}.json
    refcoco-unc/
      instances_cocofied_{train,val,testA,testB}.json
    refcocoplus-unc/
      instances_cocofied_{train,val,testA,testB}.json
    refcoco-mixed/
      instances_cocofied_train.json
    refcoco-mixed_group-by-image/
      instances_cocofied_train.json
```
Download the preprocessed json files from here.
`refcoco-mixed/` and some `instances_cocofied_*.json` are generated by running:
```
python3 datasets/tools/seqtr2coco/convert_mix_ref.py
```
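Building a "mixed" split amounts to concatenating several COCO-style referring-expression jsons into one training file. A sketch of the general merge under stated assumptions (the inputs share one category list, image ids refer to the same COCO images so duplicates can be dropped, and annotation ids must be re-numbered to stay unique); this illustrates the idea, not the exact logic of `convert_mix_ref.py`:

```python
import json

def merge_cocofied_jsons(paths, out_path):
    """Merge several COCO-style jsons: deduplicate images by id,
    re-number annotation ids, keep the first file's categories."""
    merged = {"images": [], "annotations": [], "categories": None}
    seen_images = set()
    next_ann_id = 1
    for path in paths:
        with open(path) as f:
            data = json.load(f)
        if merged["categories"] is None:
            merged["categories"] = data["categories"]
        for img in data["images"]:
            if img["id"] not in seen_images:  # same COCO image may recur
                seen_images.add(img["id"])
                merged["images"].append(img)
        for ann in data["annotations"]:
            ann = dict(ann)
            ann["id"] = next_ann_id  # re-number to keep ids unique
            next_ann_id += 1
            merged["annotations"].append(ann)
    with open(out_path, "w") as f:
        json.dump(merged, f)
```
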
`refcoco-mixed_group-by-image/` and its `instances_cocofied_train.json` are generated by running:
```
python3 datasets/tools/seqtr2coco/convert_refcoco_mixed_group_by_image.py
```
## Expected dataset structure for GQA:
```
$DETECTRON2_DATASETS/
  gqa/
    images/
    gqa_region_{train,val}.json
    gqa_region.json
```
`gqa_region*.json` are generated by running:
```
python3 datasets/tools/gqa2coco/convert.py --data_path datasets/gqa/ --img_path datasets/gqa/images --sg_path datasets/gqa/ --vg_img_data_path datasets/visualgenome/annotations/ --out_path datasets/gqa/
```
## Expected dataset structure for PhraseCut:
```
$DETECTRON2_DATASETS/
  phrasecut/
    images/
    phrasecut_{train,val,miniv,test}.json
```
`phrasecut_*.json` are generated by running:
```
python3 datasets/tools/phrasecut2coco/convert.py --data_path datasets/phrasecut/ --img_path datasets/phrasecut/images --out_path datasets/phrasecut/
```
## Expected dataset structure for Flickr30k:
```
$DETECTRON2_DATASETS/
  flickr30k/
    flickr30k-images/
    flickr30k_separateGT_{train,val,test}.json
```
`flickr30k_separateGT_*.json` are generated by running:
```
python3 datasets/tools/flickr2coco/convert.py --flickr_path datasets/flickr30k/flickr30k_entities/ --out_path datasets/flickr30k/
```
## Expected dataset structure for ODinW:
```
$DETECTRON2_DATASETS/
  odinw/
    AerialMaritimeDrone/
    AmericanSignLanguageLetters/
    ...
    WildfireSmoke/
```
After downloading, update the json files by running:
```
python3 datasets/tools/odinw/convert.py
```
This is needed because of https://github.com/cocodataset/cocoapi/issues/507#issuecomment-857272753.
## Expected dataset structure for SegInW:
```
$DETECTRON2_DATASETS/
  seginw/
    Airplane-Parts/
    Bottles/
    ...
    Watermelon/
```
## Expected dataset structure for Roboflow100:
```
$DETECTRON2_DATASETS/
  rf100/
    4-fold-defect/
    abdomen-mri/
    ...
    x-ray-rheumatology/
```
## Expected dataset structure for ADE20k-Full:
```
$DETECTRON2_DATASETS/
  ADE20K_2021_17_01/
    images/
    images_detectron2/
    annotations_detectron2/
    index_ade20k.pkl
    objects.txt
```
The directories `images_detectron2` and `annotations_detectron2` are generated by running:
```
python datasets/prepare_ade20k_full_sem_seg.py
```
## Expected dataset structure for BDD10k:
```
$DETECTRON2_DATASETS/
  bdd100k/
    images/
    labels/
      pan_seg/
        coco_pano/
        meta/
        ...
      ...
      seg/
```
`coco_pano/` and `meta/` are generated by running:
```
wget https://github.com/shenyunhang/APE/releases/download/0/bdd_generated.tar.gz
tar xvzf bdd_generated.tar.gz
```
## Expected dataset structure for PC459 and PC59:
```
$DETECTRON2_DATASETS/
  VOCdevkit/
    VOC2010/
      Annotations/
      ImageSets/
      JPEGImages/
      SegmentationClass/
      SegmentationObject/
      # below are from https://www.cs.stanford.edu/~roozbeh/pascal-context/trainval.tar.gz
      trainval/
      labels.txt
      59_labels.txt  # https://www.cs.stanford.edu/~roozbeh/pascal-context/59_labels.txt
      pascalcontext_val.txt  # https://drive.google.com/file/d/1BCbiOKtLvozjVnlTJX51koIveUZHCcUh/view?usp=sharing
    # below are generated
    annotations_detectron2/
      pc459_val/
      pc59_val/
```
It starts with a tar file `VOCtrainval_03-May-2010.tar`. You may want to download the 5K validation set here.
The directory `annotations_detectron2` is generated by running:
```
python datasets/prepare_pascal_context.py
```
## Expected dataset structure for VOC:
```
$DETECTRON2_DATASETS/
  VOCdevkit/
    VOC2012/
      Annotations/
      ImageSets/
      JPEGImages/
      SegmentationClass/
      SegmentationObject/
      SegmentationClassAug/  # https://github.com/kazuto1011/deeplab-pytorch/blob/master/data/datasets/voc12/README.md
      # below are generated
      images_detectron2/
      annotations_detectron2/
        val/
```
It starts with a tar file `VOCtrainval_11-May-2012.tar`. We use the SBD augmented training data as `SegmentationClassAug`, following DeepLab.
The directories `images_detectron2` and `annotations_detectron2` are generated by running:
```
python datasets/prepare_voc_sem_seg.py
```
## Expected dataset structure for D3:
```
$DETECTRON2_DATASETS/
  D3/
    d3_images/
    d3_json/
    d3_pkl/
```