DETReg: Unsupervised Pretraining with Region Priors for Object Detection

Amir Bar, Xin Wang, Vadim Kantorov, Colorado J Reed, Roei Herzig, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson

This repository is the implementation of DETReg; see the Project Page for more details.

Introduction

DETReg is an unsupervised pretraining approach for object DEtection with TRansformers using Region priors. Motivated by the two tasks underlying object detection, localization and categorization, we combine two complementary signals for self-supervision. For the object localization signal, we use pseudo ground-truth object bounding boxes from an off-the-shelf unsupervised region proposal method, Selective Search, which does not require training data and can detect objects at a high recall rate but very low precision. The categorization signal comes from an object embedding loss that encourages invariant object representations, from which the object category can be inferred. We show how to combine these two signals to train the Deformable DETR detection architecture from large amounts of unlabeled data. DETReg improves performance over competitive baselines and previous self-supervised methods on standard benchmarks such as MS COCO and PASCAL VOC. DETReg also outperforms previous supervised and unsupervised baseline approaches in the low-data regime, when trained with only 1%, 2%, 5%, and 10% of the labeled data on MS COCO.

Demo

Interact with the DETReg pretrained model in a Google Colab!

Installation

Requirements

Linux, CUDA>=9.2, GCC>=5.4
Python>=3.7

We recommend using Anaconda to create a conda environment:
```
conda create -n detreg python=3.7 pip
```
Then, activate the environment:
```
conda activate detreg
```
Install PyTorch (change cudatoolkit to match your CUDA version; for details, see the PyTorch installation instructions):
```
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
```
Install the other requirements:
```
pip install -r requirements.txt
```

Compiling CUDA operators

```
cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py
```

Usage

Dataset preparation

ImageNet/ImageNet100

Download ImageNet and organize it in the following structure:

```
code_root/
└── data/
    └── ilsvrc/
        ├── train/
        └── val/
```

Note that in this work we also used the ImageNet100 dataset, which is 10x smaller than ImageNet. To create ImageNet100, run the following commands:

```
mkdir -p data/ilsvrc100/train
mkdir -p data/ilsvrc100/val
while read line; do ln -s <code_root>/data/ilsvrc/train/$line <code_root>/data/ilsvrc100/train/$line; done < <code_root>/datasets/category.txt
while read line; do ln -s <code_root>/data/ilsvrc/val/$line <code_root>/data/ilsvrc100/val/$line; done < <code_root>/datasets/category.txt
```
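
The same subset can also be built from Python. The following is only a sketch under assumptions: the helper name `link_subset` is hypothetical and not part of this repository, and it assumes that `datasets/category.txt` lists one class folder name per line.

```python
import os
from pathlib import Path


def link_subset(src_root, dst_root, category_file, splits=("train", "val")):
    """Symlink the class folders listed in category_file from src_root into dst_root.

    Hypothetical helper (not part of DETReg) that mirrors the shell loops above.
    Returns the number of links created.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    # One class folder name per line; blank lines are skipped.
    lines = Path(category_file).read_text().splitlines()
    names = [ln.strip() for ln in lines if ln.strip()]
    created = 0
    for split in splits:
        (dst_root / split).mkdir(parents=True, exist_ok=True)
        for name in names:
            src = (src_root / split / name).resolve()
            dst = dst_root / split / name
            if dst.is_symlink() or dst.exists():
                continue  # already linked; keeps repeated calls harmless
            os.symlink(src, dst, target_is_directory=True)
            created += 1
    return created
```

Called from the code root, this would look like `link_subset("data/ilsvrc", "data/ilsvrc100", "datasets/category.txt")`. Links are made to absolute paths so they stay valid regardless of the working directory.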

This should result in the following structure:

```
code_root/
└── data/
    ├── ilsvrc/
    │   ├── train/
    │   └── val/
    └── ilsvrc100/
        ├── train/
        └── val/
```

MSCoco

Download the COCO 2017 dataset and organize it in the following structure:

```
code_root/
└── data/
    └── MSCoco/
        ├── train2017/
        ├── val2017/
        └── annotations/
            ├── instances_train2017.json
            └── instances_val2017.json
```
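
Before launching a long run, it can help to confirm that the layout matches the tree above. The sketch below is an illustration only: `missing_paths` and `COCO_LAYOUT` are hypothetical names, not part of this repository, and the check looks only at whether each path exists.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
COCO_LAYOUT = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_paths(root, expected=COCO_LAYOUT):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]
```

From the code root, `missing_paths("data/MSCoco")` returns an empty list when every expected folder and annotation file is in place.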

Pascal VOC

Download the Pascal VOC dataset (2012trainval, 2007trainval, and 2007test):

```
mkdir -p data/pascal && cd data/pascal
wget http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# tar reads one archive per invocation, so extract each file in turn
for f in *.tar; do tar -xvf "$f"; done
```

The files should be organized in the following structure:

```
code_root/
└── data/
    └── pascal/
        └── VOCdevkit/
            ├── VOC2007
            └── VOC2012
```

Create ImageNet Selective Search boxes:

Note: if you skip the following steps, the box cache is created on the fly during training, which slows training down.

Download the precomputed ImageNet boxes and extract them into the cache folder:

```
mkdir -p <code_root>/cache/ilsvrc && cd <code_root>/cache/ilsvrc
wget https://github.com/amirbar/DETReg/releases/download/1.0.0/ss_box_cache.tar.gz
tar -xf ss_box_cache.tar.gz
```

Alternatively, you can compute Selective Search boxes yourself:

To create selective search boxes for ImageNet100 on a single machine, run the following command (set num_processes):

```
python -m datasets.cache_ss --dataset imagenet100 --part 0 --num_m 1 --num_p <num_processes_to_use>
```

To speed up box creation, split the work across several machines: set the arguments accordingly and run the following command on each machine, giving each one its own machine number:

```
python -m datasets.cache_ss --dataset imagenet100 --part <machine_number> --num_m <num_machines> --num_p <num_processes_to_use>
```
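
The `--part` and `--num_m` arguments suggest that each machine processes one slice of the image list. How `datasets.cache_ss` actually divides the work is not described here, so the following is only an illustrative sketch of one common scheme (strided sharding), using a hypothetical `shard` helper and made-up file names:

```python
def shard(items, part, num_parts):
    """Return the slice of `items` handled by worker `part` (0-based) out of `num_parts`."""
    if not 0 <= part < num_parts:
        raise ValueError("part must satisfy 0 <= part < num_parts")
    # Strided split: worker 0 takes items 0, n, 2n, ...; worker 1 takes 1, n+1, ...
    return items[part::num_parts]


# Hypothetical file names, used only to illustrate the split.
images = [f"img_{i:03d}.JPEG" for i in range(10)]
parts = [shard(images, p, 3) for p in range(3)]

# Every image is assigned to exactly one machine.
assert sorted(sum(parts, [])) == images
```

Whatever the real scheme is, the practical point is the same: every machine must be given the same `--num_m` and a distinct `--part`, otherwise some images are processed twice or not at all.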

The cached boxes are saved in the following structure:

```
code_root/
└── cache/
    └── ilsvrc/
```

Pretraining on ImageNet

The command for pretraining DETReg on ImageNet100 with 8 GPUs is as follows:

```
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_top30_in100.sh --batch_size 24 --num_workers 8
```

Training takes around 1.5 days on 8 NVIDIA V100 GPUs. If you want to skip this step, you can download a pretrained model (see below).

After pretraining, a checkpoint is saved in exps/DETReg_top30_in100/checkpoint.pth. To fine-tune it on the different COCO settings, use the following commands:

Finetuning on MSCoco

Fine tuning on full COCO (should take 2 days with 8 NVIDIA V100 GPUs):

```
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_full_coco.sh
```

Smaller subsets train faster, so you can use fewer GPUs (e.g., 4 GPUs with batch size 2).

Fine tuning on 1%:

```
GPUS_PER_NODE=4 ./tools/run_dist_launch.sh 4 ./configs/DETReg_fine_tune_1pct_coco.sh --batch_size 2
```

Fine tuning on 2%:

```
GPUS_PER_NODE=4 ./tools/run_dist_launch.sh 4 ./configs/DETReg_fine_tune_2pct_coco.sh --batch_size 2
```

Fine tuning on 5%:

```
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_5pct_coco.sh --batch_size 1
```

Fine tuning on 10%:

```
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_10pct_coco.sh --batch_size 1
```
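
When changing the number of GPUs, it may be useful to keep track of the total batch size. The short calculation below assumes that `--batch_size` is a per-GPU value; this README does not state that, so treat it as an assumption. `effective_batch_size` is a hypothetical helper, not part of the repository.

```python
def effective_batch_size(num_gpus, per_gpu_batch_size):
    """Total images per optimization step, ASSUMING --batch_size is per GPU."""
    return num_gpus * per_gpu_batch_size


# (number of GPUs, --batch_size) pairs taken from the commands above.
settings = {"1%": (4, 2), "2%": (4, 2), "5%": (8, 1), "10%": (8, 1)}
totals = {name: effective_batch_size(*cfg) for name, cfg in settings.items()}
print(totals)  # {'1%': 8, '2%': 8, '5%': 8, '10%': 8}
```

Under that assumption, all four low-data commands use the same total batch size of 8, so a different GPU count would call for adjusting `--batch_size` to keep the product unchanged.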

Finetuning on Pascal VOC

Fine tune on full Pascal:

```
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_full_pascal.sh --batch_size 4 --epochs 100 --lr_drop 70
```

Fine tune on 10% of Pascal:

```
GPUS_PER_NODE=2 ./tools/run_dist_launch.sh 2 ./configs/DETReg_fine_tune_10pct_pascal.sh --batch_size 4 --epochs 200 --lr_drop 150
```

Evaluation

To evaluate a fine-tuned model, run the following command from the project base directory:

```
./configs/<config file>.sh --resume exps/<config file>/checkpoint.pth --eval
```

Pretrained Models

Citation

If you found this code helpful, feel free to cite our work:

```
@misc{bar2021detreg,
      title={DETReg: Unsupervised Pretraining with Region Priors for Object Detection},
      author={Amir Bar and Xin Wang and Vadim Kantorov and Colorado J Reed and Roei Herzig and Gal Chechik and Anna Rohrbach and Trevor Darrell and Amir Globerson},
      year={2021},
      eprint={2106.04550},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

Related Works

If you found DETReg useful, consider checking out these related works as well: ReSim, SwAV, DETR, UP-DETR, and Deformable DETR.

Acknowledgments

DETReg builds on the code bases of previous works such as Deformable DETR and UP-DETR. If you found DETReg useful, please consider citing these works as well.

License

DETReg is released under the Apache 2.0 license. Please see the LICENSE file for more information.
