A Siamese self-supervised pretraining approach for the Transformer architecture in DETR

Zx55/SiameseDETR

Siamese DETR

By Zeren Chen, Gengshi Huang, Wei Li, Jianing Teng, Kun Wang, Jing Shao, Chen Change Loy, Lu Sheng.

This repository is an official implementation of the paper Siamese DETR.

Introduction

We propose Siamese DETR, a novel self-supervised pretraining method for DETR. With two newly designed pretext tasks, we directly locate the query regions generated by EdgeBoxes in a cross-view manner and maximize cross-view semantic consistency, so that the learned localization and discrimination representations transfer to downstream detection tasks.
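
As a purely geometric illustration of what "locating a region in a cross-view manner" can mean, the sketch below expresses one region box in the normalized coordinates of two different crops of the same image. It is not the paper's pretext task or loss; the function name `box_in_view`, the `(x1, y1, x2, y2)` box convention, and the example coordinates are all assumptions made for this example.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def box_in_view(box: Box, view: Box) -> Optional[Box]:
    """Express `box` in the normalized [0, 1] coordinates of the crop `view`.

    The box is clipped to the crop first. Returns None when the box and
    the crop do not overlap.
    """
    vx1, vy1, vx2, vy2 = view
    # Clip the region to the crop window.
    x1, y1 = max(box[0], vx1), max(box[1], vy1)
    x2, y2 = min(box[2], vx2), min(box[3], vy2)
    if x2 <= x1 or y2 <= y1:
        return None
    w, h = vx2 - vx1, vy2 - vy1
    return ((x1 - vx1) / w, (y1 - vy1) / h, (x2 - vx1) / w, (y2 - vy1) / h)


# Hypothetical region proposal and two hypothetical crops of one image.
region = (100.0, 100.0, 200.0, 200.0)
view_a = (0.0, 0.0, 400.0, 400.0)
view_b = (50.0, 50.0, 250.0, 250.0)

target_a = box_in_view(region, view_a)
target_b = box_in_view(region, view_b)
print(target_a, target_b)
```

The same region ends up at different normalized positions in each view, which is why a model has to reason across views to localize it.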

Preliminary

1. Environment Requirements

This code is built on the MMSelfSup codebase. We tested it with Python 3.6, PyTorch 1.8.1, CUDA 9.0, and cuDNN 7.6.5. Other requirements can be installed via:

pip install -r requirements/runtime.txt --user
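
To compare a local environment with the tested versions listed above, a small check along the following lines may help. It is a sketch, not part of the repository: the helper names `installed_versions` and `mismatches` are hypothetical, and the cuDNN value is reported as the raw integer that PyTorch returns and is not compared.

```python
import sys

# Versions the README reports as tested.
TESTED = {"python": "3.6", "torch": "1.8.1", "cuda": "9.0"}


def installed_versions():
    """Collect the locally installed versions that can be detected."""
    found = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch
    except ImportError:
        return found  # PyTorch is not installed; report Python only.
    found["torch"] = torch.__version__.split("+")[0]  # drop local build tags
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    found["cudnn_raw"] = torch.backends.cudnn.version()  # integer or None
    return found


def mismatches(found, tested=TESTED):
    """Return {name: (found, tested)} for every detected version that differs."""
    return {
        name: (found[name], want)
        for name, want in tested.items()
        if name in found and found[name] != want
    }


if __name__ == "__main__":
    for name, (have, want) in mismatches(installed_versions()).items():
        print("{}: found {}, tested with {}".format(name, have, want))
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the tested configuration.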

2. Deformable Attention

Compile the deformable attention wheel following the instructions in the Deformable DETR repo.

3. Offline EdgeBoxes

You can generate offline EdgeBoxes via:

cd tools/hycplayers
pip install -r requirements.txt --user

sh run.sh <YOUR_PARTITION> <NUM_PROCESS> <PATH/TO/DATA/ROOT> <PATH/TO/DATA/META> <PATH/TO/SAVE/DIR> <DATASET>  # <DATASET> should be imagenet or coco 

4. SwAV Pretrained Backbone

Siamese DETR uses a frozen SwAV backbone to extract features. You can download the SwAV pretrained backbone (the 800-epoch model, matching the swav_800ep_pretrain_oss.pth.tar link name used in step 5) from the SwAV repo.

5. Create Soft Link

We provide COCO-style PASCAL VOC meta files (see data/datasets/voc_meta) for downstream finetuning.

Copy them into your VOC directory; the cp command below does this.

mkdir -p data/datasets/edgebox data/model_zoo/resnet

ln -s path/to/swav/backbone data/model_zoo/resnet/swav_800ep_pretrain_oss.pth.tar

# dataset
ln -s path/to/coco2017 data/datasets/mscoco2017
ln -s path/to/imagenet data/datasets/imagenet
ln -s path/to/voc data/datasets/voc
cp -rf data/datasets/voc_meta data/datasets/voc/meta

# edgebox
ln -s path/to/imagenet/edgebox data/datasets/edgebox/imagenet
ln -s path/to/coco/edgebox data/datasets/edgebox/coco
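
Once the links are in place, a quick check like the one below can confirm that every expected path resolves. This is a sketch rather than a repository tool: the helper name `missing_paths` is hypothetical, and the path list is simply copied from the commands above.

```python
from pathlib import Path

# Paths that the link and copy commands above are expected to create.
EXPECTED = [
    "data/model_zoo/resnet/swav_800ep_pretrain_oss.pth.tar",
    "data/datasets/mscoco2017",
    "data/datasets/imagenet",
    "data/datasets/voc",
    "data/datasets/voc/meta",
    "data/datasets/edgebox/imagenet",
    "data/datasets/edgebox/coco",
]


def missing_paths(root=".", expected=EXPECTED):
    """Return the expected paths that do not resolve under `root`.

    Path.exists() follows symlinks, so a dangling link counts as missing.
    """
    base = Path(root)
    return [p for p in expected if not (base / p).exists()]


# Run from the repository root.
for path in missing_paths():
    print("missing:", path)
```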

6. Downstream Preparation

We provide an example for downstream finetuning, i.e., Conditional DETR in downstream_finetune/conditionaldetr.

The primary modifications compared to the original repo are:

  • Add a --pretrain argument to load a Siamese DETR pretrained checkpoint. (See main.py L160-168)

  • Add PASCAL VOC datasets. (See voc.py)
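
For readers adapting another DETR variant, the first modification could look roughly like the sketch below. This is not the code in main.py: the parser description, the helper `filter_matching`, and the example parameter names are hypothetical, and parameter shapes stand in for real tensors so the example runs without PyTorch.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser("downstream finetuning (sketch)")
    # Path to a converted Siamese DETR checkpoint; empty means train from scratch.
    parser.add_argument("--pretrain", default="", type=str,
                        help="path to a converted pretrained checkpoint")
    return parser


def filter_matching(pretrained_shapes, model_shapes):
    """Split pretrained keys into those the model can load and those it cannot.

    A key is loadable when the model has a parameter with the same name
    and the same shape.
    """
    kept = {k for k, shape in pretrained_shapes.items()
            if model_shapes.get(k) == shape}
    skipped = set(pretrained_shapes) - kept
    return kept, skipped


args = build_parser().parse_args(["--pretrain", "converted.pth"])

# Hypothetical parameter names and shapes, for illustration only.
pretrained = {"transformer.layer.weight": (256, 256), "class_embed.weight": (2, 256)}
model = {"transformer.layer.weight": (256, 256), "class_embed.weight": (91, 256)}
kept, skipped = filter_matching(pretrained, model)
print(args.pretrain, sorted(kept), sorted(skipped))
```

In a real PyTorch script, the filtered entries would typically be passed to `model.load_state_dict(state, strict=False)` after `torch.load(args.pretrain, map_location="cpu")`; whether main.py filters keys this way is an assumption.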

Usage

# upstream pretraining on ImageNet/COCO
# You can check work_dirs/selfsup/siamese_detr/<cfgs>/<time>.log.json for training details.
sh tools/srun_train.sh <PARTITION> <CONFIG> <NUM_GPU> <JOB_NAME>  

# convert openselfsup (OSS) checkpoint to detr checkpoint
python tools/convert_to_detr.py --ckpt <OSS_CKPT> --export <SAVE_DIR> [--deform]

# downstream finetune
cd downstream_finetune/<DETR_VARIANTS_DIR>
sh [train_coco.sh|train_voc.sh] <PARTITION> <CONVERTED_CKPT> <NUM_GPU> <JOB_NAME>
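
The .log.json file mentioned in the pretraining comment above can be inspected with a short script. The sketch below assumes the file holds one JSON object per line with "mode", "epoch", "iter", and "loss" fields, as MMCV-style text loggers commonly write; the actual field names in this repository's logs may differ, and `read_train_losses` is a hypothetical helper.

```python
import json


def read_train_losses(log_path, key="loss"):
    """Return (epoch, iter, value) tuples for training records in a JSON-lines log."""
    losses = []
    with open(log_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not valid JSON
            if (isinstance(record, dict)
                    and record.get("mode") == "train"
                    and key in record):
                losses.append((record.get("epoch"), record.get("iter"), record[key]))
    return losses
```

For example, `read_train_losses("work_dirs/selfsup/siamese_detr/<cfgs>/<time>.log.json")` (with the placeholders filled in) would return the training loss curve as a list that can be plotted or summarized.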

Model Zoo

We provide pretrained checkpoints here.

Transfer results on ImageNet -> COCO (we report AP on the downstream benchmark)

Method        Vanilla  ConditionDETR-100q  DeformableDETR-MS-300q
from scratch  39.7     37.7                45.5
UP-DETR       40.5     39.4                45.3
DETReg        41.9     40.2                45.5
SiameseDETR   42.0     40.5                46.4
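
To read the table above as gains, the AP differences can be computed directly from the reported numbers. The sketch below only does that arithmetic; the dictionary layout and the helper `gains_over` are illustrative, and the three values in each tuple follow the column order Vanilla, ConditionDETR-100q, DeformableDETR-MS-300q.

```python
# AP values copied from the ImageNet -> COCO table above.
coco_ap = {
    "from scratch": (39.7, 37.7, 45.5),
    "UP-DETR": (40.5, 39.4, 45.3),
    "DETReg": (41.9, 40.2, 45.5),
    "SiameseDETR": (42.0, 40.5, 46.4),
}


def gains_over(baseline, method="SiameseDETR", table=coco_ap):
    """AP difference (method minus baseline) per column, rounded to 0.1."""
    return tuple(round(m - b, 1) for m, b in zip(table[method], table[baseline]))


print(gains_over("from scratch"))  # (2.3, 2.8, 0.9)
print(gains_over("DETReg"))        # (0.1, 0.3, 0.9)
```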

Transfer results on ImageNet -> PASCAL VOC

Method        Vanilla  ConditionDETR-100q  DeformableDETR-MS-300q
from scratch  47.8     49.9                56.1
UP-DETR       54.4     56.9                56.4
DETReg        57.0     57.5                59.7
SiameseDETR   57.4     58.1                61.2

Transfer results on COCO -> PASCAL VOC

Method        ConditionDETR-100q
from scratch  49.9
UP-DETR       51.3
DETReg        55.9
SiameseDETR   57.7

License

Siamese DETR is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Citation

@article{chen2023siamese,
  title={Siamese DETR},
  author={Chen, Zeren and Huang, Gengshi and Li, Wei and Teng, Jianing and Wang, Kun and Shao, Jing and Loy, Chen Change and Sheng, Lu},
  journal={arXiv preprint arXiv:2303.18144},
  year={2023}
}
