TransCenter: Transformers with Dense Queries for Multiple-Object Tracking
Yihong Xu, Yutong Ban, Guillaume Delorme, Chuang Gan, Daniela Rus, Xavier Alameda-Pineda
[Paper] [Project]

Bibtex

If you find this code useful, please star the project and consider citing:

@misc{xu2021transcenter,
      title={TransCenter: Transformers with Dense Queries for Multiple-Object Tracking}, 
      author={Yihong Xu and Yutong Ban and Guillaume Delorme and Chuang Gan and Daniela Rus and Xavier Alameda-Pineda},
      year={2021},
      eprint={2103.15145},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Environment Preparation

Option 1 (recommended):

We provide two Singularity images (similar to Docker) containing all the packages needed for TransCenter:

  1. Install Singularity (version > 3.7.1): https://sylabs.io/guides/3.0/user-guide/installation.html#install-on-linux
  2. Download one of the Singularity images:

pytorch1-5cuda10-1.sif, tested with the Nvidia GTX TITAN; or

pytorch1-5cuda10-1_RTX.sif, tested with the Nvidia RTX TITAN, Quadro RTX 8000, RTX 2080Ti, and Quadro RTX 4000.

  • Launch a Singularity image
singularity shell --nv --bind yourLocalPath:yourPathInsideImage YourSingularityImage.sif

--bind: links a local path to a path inside the image, so that data on your local machine can be accessed inside the Singularity image;
--nv: use the local Nvidia driver.

Option 2:

You can also build your own environment:

  1. We use Anaconda to simplify package installation; you can download Anaconda (4.9.2) here: https://www.anaconda.com/products/individual
  2. Create your conda environment with:
conda create --name <YourEnvName> --file requirements.txt
  3. TransCenter uses the deformable transformer from Deformable DETR, so we need to build the deformable attention modules:
cd ./to_install/ops
sh ./make.sh
# unit test (all checks should print True)
python test.py
  4. TransCenter uses pytorch-liteflownet during tracking, which depends on correlation_package. Install it with:
cd ./to_install/correlation_package
python setup.py install
  5. For the up-scale and merge module in TransCenter, we use the deformable convolution module (DCNv2). Install it with:
cd ./to_install/DCNv2
./make.sh         # build
python testcpu.py    # run examples and gradient check on cpu
python testcuda.py   # run examples and gradient check on gpu

See also the known issues listed at https://github.com/CharlesShang/DCNv2. If you run into CUDA-related issues with these third-party modules, try recompiling them on the GPU that you use for training and testing. The dependencies are compatible with PyTorch 1.5 and CUDA 10.2.

Data Preparation

MS COCO: we use only the person category for pretraining TransCenter. The filtering code is provided in ./data/coco_person.py; a minimal sketch of the idea is shown after the citation below.

@inproceedings{lin2014microsoft,
  title={Microsoft coco: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={European conference on computer vision},
  pages={740--755},
  year={2014},
  organization={Springer}
}
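As a rough illustration of what the filtering script does, here is a minimal sketch. It is not the official ./data/coco_person.py; it only keeps person annotations in a COCO-style JSON file, and the file names are placeholders.

# Minimal sketch: keep only person annotations in a COCO-style JSON file.
# Illustration only, not ./data/coco_person.py; file names are placeholders.
import json

def keep_person_only(src_json, dst_json, person_cat_id=1):
    with open(src_json) as f:
        coco = json.load(f)
    # Person is category id 1 in MS COCO.
    anns = [a for a in coco["annotations"] if a["category_id"] == person_cat_id]
    img_ids = {a["image_id"] for a in anns}
    coco["annotations"] = anns
    # Drop images without person annotations and all non-person categories.
    coco["images"] = [im for im in coco["images"] if im["id"] in img_ids]
    coco["categories"] = [c for c in coco["categories"] if c["id"] == person_cat_id]
    with open(dst_json, "w") as f:
        json.dump(coco, f)

keep_person_only("instances_train2017.json", "instances_train2017_person.json")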

CrowdHuman: CrowdHuman labels are converted to COCO format; the conversion can be done with ./data/convert_crowdhuman_to_coco.py (a sketch of the idea is shown after the citation below).

@article{shao2018crowdhuman,
    title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
    author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
    journal={arXiv preprint arXiv:1805.00123},
    year={2018}
  }
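For reference, a minimal sketch of the conversion idea follows. It is not the official ./data/convert_crowdhuman_to_coco.py; it assumes the usual CrowdHuman .odgt layout (one JSON record per line with an "ID" and a list of "gtboxes", the full-body box stored in "fbox"), and the file names are placeholders.

# Minimal sketch of a CrowdHuman .odgt -> COCO-format conversion (illustration
# only). Assumes each line is a JSON record with "ID" and "gtboxes", where the
# full-body box is stored in "fbox" as [x, y, w, h].
import json

def odgt_to_coco(odgt_path, out_path):
    images, annotations = [], []
    ann_id = 1
    with open(odgt_path) as f:
        for img_id, line in enumerate(f, start=1):
            rec = json.loads(line)
            images.append({"id": img_id, "file_name": rec["ID"] + ".jpg"})
            for box in rec.get("gtboxes", []):
                if box.get("tag") != "person":  # skip ignore/mask regions
                    continue
                x, y, w, h = box["fbox"]
                annotations.append({"id": ann_id, "image_id": img_id,
                                    "category_id": 1, "bbox": [x, y, w, h],
                                    "area": w * h, "iscrowd": 0})
                ann_id += 1
    coco = {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "person"}]}
    with open(out_path, "w") as f:
        json.dump(coco, f)

odgt_to_coco("annotation_train.odgt", "crowdhuman_train_coco.json")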

MOT17: MOT17 labels are converted to COCO format; the conversion can be done with ./data/convert_mot_to_coco.py (a sketch of the idea is shown after the citation below).

@article{milan2016mot16,
  title={MOT16: A benchmark for multi-object tracking},
  author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
  journal={arXiv preprint arXiv:1603.00831},
  year={2016}
}
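A minimal sketch of the per-sequence idea is shown below. It is not the official ./data/convert_mot_to_coco.py; it assumes the standard MOTChallenge gt.txt layout (frame, track id, left, top, width, height, confidence, class, visibility), uses placeholder paths, and skips the class/visibility filtering and sequence metadata that the real script handles.

# Minimal sketch: convert one MOTChallenge gt.txt to COCO-style entries
# (illustration only). The real ./data/convert_mot_to_coco.py also filters by
# class/visibility and records sequence metadata.
import csv
import json

def mot_gt_to_coco(gt_path, out_path, seq_name="MOT17-02-SDP"):
    images, annotations = {}, []
    with open(gt_path) as f:
        for ann_id, row in enumerate(csv.reader(f), start=1):
            frame, track_id = int(row[0]), int(row[1])
            x, y, w, h = map(float, row[2:6])
            images.setdefault(frame, {"id": frame, "frame_id": frame,
                                      "file_name": f"{seq_name}/img1/{frame:06d}.jpg"})
            annotations.append({"id": ann_id, "image_id": frame,
                                "category_id": 1, "track_id": track_id,
                                "bbox": [x, y, w, h], "area": w * h,
                                "iscrowd": 0})
    coco = {"images": list(images.values()), "annotations": annotations,
            "categories": [{"id": 1, "name": "person"}]}
    with open(out_path, "w") as f:
        json.dump(coco, f)

mot_gt_to_coco("MOT17/train/MOT17-02-SDP/gt/gt.txt", "mot17_02_coco.json")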

MOT20: MOT20 labels are converted to COCO format; the conversion can be done with ./data/convert_mot20_to_coco.py.

@article{dendorfer2020mot20,
  title={Mot20: A benchmark for multi object tracking in crowded scenes},
  author={Dendorfer, Patrick and Rezatofighi, Hamid and Milan, Anton and Shi, Javen and Cremers, Daniel and Reid, Ian and Roth, Stefan and Schindler, Konrad and Leal-Taix{\'e}, Laura},
  journal={arXiv preprint arXiv:2003.09003},
  year={2020}
}

We also provide the filtered/converted labels:

MS COCO person labels: please put the annotations folder (inside cocoperson) into your MS COCO dataset root folder.

CrowdHuman COCO-format labels: please put the annotations folder (inside crowdhuman) into your CrowdHuman dataset root folder.

MOT17 COCO-format labels: please put the annotations and annotations_onlySDP folders (inside MOT17) into your MOT17 dataset root folder.

MOT20 COCO-format labels: please put the annotations folder (inside MOT20) into your MOT20 dataset root folder.

Model Zoo

deformable transformer pretrained: the pretrained model from Deformable-DETR.

coco_pretrained: model trained on the COCO person dataset.

CH_pretrained: model pretrained on COCO person and fine-tuned on the CrowdHuman dataset.

MOT17_fromCoCo: model pretrained on COCO person and fine-tuned on the MOT17 train set.

MOT17_fromCH: model pretrained on CrowdHuman and fine-tuned on the MOT17 train set.

MOT20_fromCoCo: model pretrained on COCO person and fine-tuned on the MOT20 train set.

MOT20_fromCH: model pretrained on CrowdHuman and fine-tuned on the MOT20 train set.

Please put all the pretrained models in ./model_zoo.
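If you want to sanity-check a downloaded checkpoint before passing it to --resume, the small sketch below may help; it assumes the checkpoint is a Python dict that may wrap the weights under a "model" key (as Deformable-DETR-style checkpoints do).

# Quick sanity check of a downloaded checkpoint before using it with --resume.
# Assumption: the checkpoint is a dict that may wrap its weights under "model".
import torch

ckpt = torch.load("./model_zoo/coco_pretrained.pth", map_location="cpu")
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
print(f"loaded {len(state_dict)} tensors; sample keys: {list(state_dict)[:3]}")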

Training

  • Pretrain on the COCO person dataset:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=4 --use_env ./training/transcenter/main_coco_tracking.py --output_dir=./output/whole_coco --batch_size=4 --num_workers=20 --resume=./model_zoo/r50_deformable_detr-checkpoint.pth --pre_hm --tracking --data_dir=PathToCoCoDataset
  • Pretrain on the CrowdHuman dataset:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=4 --use_env ./training/transcenter/main_crowdHuman_tracking.py --output_dir=./output/whole_ch_from_COCO --batch_size=4 --num_workers=20 --resume=./model_zoo/coco_pretrained.pth --pre_hm --tracking --data_dir=PathToCrowdHumanDataset
  • Train MOT17 from the COCO-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot17_tracking.py --output_dir=./output/whole_MOT17_from_COCO --batch_size=2 --num_workers=20 --resume=./model_zoo/coco_pretrained.pth --pre_hm --tracking  --same_aug_pre --image_blur_aug --data_dir=PathToMOT17dataset
  • Train MOT17 from the CrowdHuman-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot17_tracking.py --output_dir=./output/whole_MOT17_from_CH --batch_size=2 --num_workers=20 --resume=./model_zoo/CH_pretrained.pth --pre_hm --tracking  --same_aug_pre --image_blur_aug --data_dir=PathToMOT17dataset
  • Train MOT20 from the COCO-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot20_tracking.py --output_dir=./output/whole_MOT20_from_COCO --batch_size=2 --num_workers=20 --resume=./model_zoo/coco_pretrained.pth --pre_hm --tracking  --same_aug_pre --image_blur_aug --not_max_crop --data_dir=PathToMOT20dataset
  • Train MOT20 from the CrowdHuman-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot20_tracking.py --output_dir=./output/whole_MOT20_from_CH --batch_size=2 --num_workers=20 --resume=./model_zoo/CH_pretrained.pth --pre_hm --tracking  --same_aug_pre --image_blur_aug --not_max_crop --data_dir=PathToMOT20dataset

Tips:

  1. If you encounter RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR on some GPUs, try setting torch.backends.cudnn.benchmark=False (see the sketch after this list). In most cases, setting torch.backends.cudnn.benchmark=True is more memory-efficient.
  2. Depending on your environment and GPUs, you might see some jitter in the MOTA of your final models.
  3. You may see training noise during fine-tuning, especially when training MOT17/MOT20 from well-pretrained models. You can reduce the learning rate by a factor of 10, apply early stopping, or increase the batch size if your GPUs have more memory.
  4. If you have GPU memory issues, lower the training and evaluation batch sizes in main_****.py, freeze the ResNet backbone (see the sketch after this list), and start from our COCO/CH pretrained models.
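The sketch below puts tips 1 and 4 into code; it assumes the training script builds a torch.nn.Module whose ResNet backbone parameters share a "backbone" name prefix (a placeholder, check the actual parameter names in the code).

import torch

# Tip 1: disable cuDNN autotuning if you hit CUDNN_STATUS_INTERNAL_ERROR
# on some GPUs; set this before the model is built.
torch.backends.cudnn.benchmark = False

# Tip 4 (sketch): freeze the ResNet backbone to save GPU memory during
# fine-tuning. The "backbone" prefix is a placeholder parameter-name prefix.
def freeze_backbone(model: torch.nn.Module, prefix: str = "backbone") -> None:
    for name, param in model.named_parameters():
        if name.startswith(prefix):
            param.requires_grad = False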

Tracking

Using Public detections:

  • MOT17:
cd TransCenter_official
python ./tracking/transcenter/mot17_pub.py --data_dir=YourMOT17Path
  • MOT20:
cd TransCenter_official
python ./tracking/transcenter/mot20_pub.py --data_dir=YourMOT20Path

Using Private detections:

  • MOT17:
cd TransCenter_official
python ./tracking/transcenter/mot17_private.py --data_dir=YourMOT17Path
  • MOT20:
cd TransCenter_official
python ./tracking/transcenter/mot20_private.py --data_dir=YourMOT20Path

Notes:

  1. We recently fixed an image-loading bug that affected certain images with an aspect ratio close to 1 (in MOT20), which brings better performance on MOT20.
  2. You can test your own model by changing model_path inside mot17[20]_private[pub].py.

MOTChallenge Results

MOT17 public detections:

Pretrained   MOTA    MOTP    IDF1    FP       FN        IDS
CoCo         68.8%   79.9%   61.4%   22,860   149,188   4,102
CH           71.9%   81.4%   62.3%   17,378   137,008   4,046

MOT20 public detections:

Pretrained   MOTA    MOTP    IDF1    FP       FN        IDS
CoCo         61.0%   79.5%   49.8%   49,189   147,890   4,493
CH           62.3%   79.9%   50.3%   43,006   147,505   4,545

MOT17 private detections:

Pretrained   MOTA    MOTP    IDF1    FP       FN        IDS
CoCo         70.0%   79.6%   62.1%   28,119   136,722   4,647
CH           73.2%   81.1%   62.2%   23,112   123,738   4,614

MOT20 private detections:

Pretrained   MOTA    MOTP    IDF1    FP       FN        IDS
CoCo         60.6%   79.5%   49.6%   52,332   146,809   4,604
CH           61.9%   79.9%   50.4%   45,895   146,347   4,653

Note:

  • The results can be slightly different depending on the running environment.
  • We might keep updating the results in the near future.

Acknowledgement

The code for TransCenter is modified from, and network pre-trained weights are obtained from, the following repositories:

  1. The Person Reid Network (./tracking/transcenter/model_zoo/ResNet_iter_25245.pth) is from Tracktor.
  2. The LiteFlowNet pretrained model (./tracking/transcenter/util/LiteFlownet/network-kitti.pytorch) is from pytorch-liteflownet and LiteFlowNet.
  3. The deformable transformer pretrained model (./model_zoo/r50_deformable_detr-checkpoint.pth) is from Deformable-DETR.
  4. The data format conversion code is modified from CenterTrack.

BibTeX entries for CenterTrack, Tracktor, and Deformable-DETR:

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={ECCV},
  year={2020}
}

@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}

@article{zhu2020deformable,
  title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
  author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
  journal={arXiv preprint arXiv:2010.04159},
  year={2020}
}

Several modules are from:

MOT Metrics in Python: py-motmetrics

Soft-NMS: Soft-NMS

DETR: DETR

DCNv2: DCNv2

correlation_package: correlation_package

pytorch-liteflownet: pytorch-liteflownet

LiteFlowNet: LiteFlowNet

@InProceedings{hui18liteflownet,
    author = {Tak-Wai Hui and Xiaoou Tang and Chen Change Loy},
    title = {LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation},
    booktitle  = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2018},
    pages = {8981--8989},
    }
