This repository is an official implementation of the paper SOLQ: Segmenting Objects by Learning Queries.
TL;DR. SOLQ is an end-to-end instance segmentation framework based on Transformers. It directly outputs instance masks without any box dependency.
Abstract. In this paper, we propose an end-to-end framework for instance segmentation. Based on the recently introduced DETR, our method, termed SOLQ, segments objects by learning unified queries. In SOLQ, each query represents one object and has multiple representations: class, location, and mask. The learned object queries perform classification, box regression, and mask encoding simultaneously in a unified vector form. During training, the encoded mask vectors are supervised by the compression coding of the raw spatial masks. At inference time, the predicted mask vectors can be directly transformed into spatial masks by the inverse process of the compression coding. Experimental results show that SOLQ achieves state-of-the-art performance, surpassing most existing approaches. Moreover, the joint learning of the unified query representation greatly improves the detection performance of the original DETR. We hope SOLQ can serve as a strong baseline for Transformer-based instance segmentation.
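To make the "compression coding" idea in the abstract concrete, here is a minimal sketch of encoding a binary mask into a short vector and decoding it back. The abstract does not name the coding scheme at this point, so the sketch assumes a 2D discrete cosine transform (DCT) that keeps only low-frequency coefficients. The function names, the 8x8 mask size, and the `keep` parameter are hypothetical and are not taken from this repository.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def low_freq_positions(n, keep):
    """Coefficient positions (u, v) with u + v < keep, in row-major order."""
    return [(u, v) for u in range(n) for v in range(n) if u + v < keep]


def encode_mask(mask, keep):
    """Assumed encoding: 2D DCT of a square mask, keeping low frequencies."""
    n = len(mask)
    d = dct_matrix(n)
    coeffs = matmul(matmul(d, mask), transpose(d))  # C = D M D^T
    return [coeffs[u][v] for u, v in low_freq_positions(n, keep)]


def decode_mask(vector, n, keep, threshold=0.5):
    """Inverse process: refill the coefficient grid, invert, then binarize."""
    coeffs = [[0.0] * n for _ in range(n)]
    for value, (u, v) in zip(vector, low_freq_positions(n, keep)):
        coeffs[u][v] = value
    d = dct_matrix(n)
    recon = matmul(matmul(transpose(d), coeffs), d)  # M = D^T C D
    return [[1 if x >= threshold else 0 for x in row] for row in recon]


if __name__ == "__main__":
    # Toy 8x8 mask containing one rectangular object.
    toy = [[1 if 2 <= r <= 5 and 1 <= c <= 4 else 0 for c in range(8)]
           for r in range(8)]
    vec = encode_mask(toy, keep=6)
    print(len(vec), "coefficients instead of", 8 * 8, "pixels")
    for row in decode_mask(vec, n=8, keep=6):
        print("".join("#" if x else "." for x in row))
```

In this sketch a smaller `keep` gives a shorter vector and a coarser reconstruction; keeping every coefficient makes the round trip lossless.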
Method | Backbone | Dataset | Box AP | Mask AP | Model |
---|---|---|---|---|---|
SOLQ | R50 | test-dev | 47.8 | 39.7 | |
SOLQ | R101 | test-dev | 48.7 | 40.9 | |
SOLQ | Swin-L | test-dev | 55.4 | 45.9 | |
The codebase is built on top of Deformable DETR.
- Linux, CUDA>=9.2, GCC>=5.4
- Python>=3.7
We recommend using Anaconda to create a conda environment:
conda create -n deformable_detr python=3.7 pip
Then, activate the environment:
conda activate deformable_detr
- PyTorch>=1.5.1, torchvision>=0.6.1 (following the official PyTorch installation instructions)
For example, if your CUDA version is 9.2, you can install PyTorch and torchvision as follows:
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
- Other requirements
pip install -r requirements.txt
- Build MultiScaleDeformableAttention
cd ./models/ops
sh ./make.sh
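After installation, it can help to confirm that the environment meets the minimum versions listed above. The following is a minimal sketch, not part of this repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it ignores local suffixes such as `+cu92`).

```python
import re
import sys


def version_tuple(text):
    """Turn '1.5.1+cu92' into (1, 5, 1); stop at the first non-numeric part."""
    numbers = []
    for part in str(text).split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment():
    """Compare the running interpreter and PyTorch stack with the README minimums."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": meets_minimum(python_version, "3.7")}
    try:
        import torch
        import torchvision
    except ImportError:
        # PyTorch or torchvision is not installed in this environment.
        report["torch"] = False
        report["torchvision"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, "1.5.1")
    report["torchvision"] = meets_minimum(torchvision.__version__, "0.6.1")
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'NOT satisfied'}")
```

This only checks versions and CUDA visibility; it does not verify that the MultiScaleDeformableAttention extension compiled correctly.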
Please download COCO and organize it as follows:
mkdir data && cd data
ln -s /path/to/coco coco
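The commands above only create the symlink; they do not spell out what the COCO directory should contain. As a sanity check, the sketch below assumes the standard COCO 2017 layout (`train2017/`, `val2017/`, and `annotations/instances_{train,val}2017.json`). That layout is an assumption here, not something this README states, and `missing_coco_entries` is a hypothetical helper.

```python
from pathlib import Path

# Assumed standard COCO 2017 layout; adjust if your copy is organized differently.
EXPECTED = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root="data/coco"):
    """Return the expected entries that do not exist under the dataset root."""
    root = Path(root)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_coco_entries()
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```

Run it from the repository root so that the default `data/coco` path resolves through the symlink.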
Train SOLQ on 8 GPUs as follows:
sh configs/r50_solq_train.sh
You can download the pretrained SOLQ model (the link is in the "Main Results" section), then run the following command to evaluate it on the COCO 2017 val set:
sh configs/r50_solq_eval.sh
You can download the pretrained SOLQ model (the link is in the "Main Results" section), then run the following command to generate results on the COCO 2017 test-dev set for submission to the evaluation server:
sh configs/r50_solq_submit.sh
If you find SOLQ useful in your research, please consider citing:
@article{dong2021solq,
  title={SOLQ: Segmenting Objects by Learning Queries},
  author={Bin Dong and Fangao Zeng and Tiancai Wang and Xiangyu Zhang and Yichen Wei},
  journal={arXiv preprint arXiv:2106.02351},
  year={2021}
}