fbragman/ModGenVIS

 
 

A Generalized Framework for Video Instance Segmentation (CVPR 2023)

Miran Heo, Sukjun Hwang, Jeongseok Hyun, Hanjung Kim, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

[arXiv] [BibTeX]


Updates

  • Feb 28, 2023: GenVIS is accepted to CVPR 2023!
  • Jan 20, 2023: Code is now available!

Installation

GenVIS is built upon VITA. See installation instructions.

Getting Started

We provide a script, train_net_genvis.py, that trains all the configs provided in GenVIS.

To train a model with train_net_genvis.py on VIS, first set up the corresponding datasets following Preparing Datasets.

Then start training from the weights pretrained on the target VIS dataset, available in VITA's Model Zoo:

python train_net_genvis.py --num-gpus 4 \
  --config-file configs/genvis/ovis/genvis_R50_bs8_online.yaml \
  MODEL.WEIGHTS vita_r50_ovis.pth

To evaluate a model's performance, pass --eval-only together with the checkpoint to evaluate:

python train_net_genvis.py --num-gpus 4 \
  --config-file configs/genvis/ovis/genvis_R50_bs8_online.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file
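
When the two commands above need to be launched repeatedly (for example from a job script), it can help to assemble them programmatically. The following is a minimal sketch rather than part of GenVIS: the helper name build_genvis_command is hypothetical, and it only uses the flags and paths shown above. It assumes the trailing KEY VALUE pairs are treated as config overrides, as in the commands above.

```python
import shlex


def build_genvis_command(config_file, weights, num_gpus=4, eval_only=False):
    """Assemble the argument list for train_net_genvis.py (hypothetical helper)."""
    cmd = [
        "python", "train_net_genvis.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if eval_only:
        # Evaluation reuses the training entry point with one extra flag.
        cmd.append("--eval-only")
    # Trailing KEY VALUE pair, mirroring the commands shown in this README.
    cmd += ["MODEL.WEIGHTS", weights]
    return cmd


config = "configs/genvis/ovis/genvis_R50_bs8_online.yaml"
train_cmd = build_genvis_command(config, "vita_r50_ovis.pth")
eval_cmd = build_genvis_command(config, "/path/to/checkpoint_file", eval_only=True)

# Print the shell-quoted commands for inspection before running them.
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

To actually start a run, the resulting list can be passed to subprocess.run(train_cmd, check=True) from the repository root.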

Model Zoo

Additional weights will be added soon!

YouTubeVIS-2019

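| Backbone | Method      | AP   | AP50 | AP75 | AR1  | AR10 | Download |
|----------|-------------|------|------|------|------|------|----------|
| R-50     | online      | 50.0 | 71.5 | 54.6 | 49.5 | 59.7 | model    |
| R-50     | semi-online | 51.3 | 72.0 | 57.8 | 49.5 | 60.0 | model    |
| Swin-L   | online      | 64.0 | 84.9 | 68.3 | 56.1 | 69.4 | model    |
| Swin-L   | semi-online | 63.8 | 85.7 | 68.5 | 56.3 | 68.4 | model    |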

YouTubeVIS-2021

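| Backbone | Method      | AP   | AP50 | AP75 | AR1  | AR10 | Download |
|----------|-------------|------|------|------|------|------|----------|
| R-50     | online      | 47.1 | 67.5 | 51.5 | 41.6 | 54.7 | model    |
| R-50     | semi-online | 46.3 | 67.0 | 50.2 | 40.6 | 53.2 | model    |
| Swin-L   | online      | 59.6 | 80.9 | 65.8 | 48.7 | 65.0 | model    |
| Swin-L   | semi-online | 60.1 | 80.9 | 66.5 | 49.1 | 64.7 | model    |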

OVIS

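| Backbone | Method      | AP   | AP50 | AP75 | AR1  | AR10 | Download |
|----------|-------------|------|------|------|------|------|----------|
| R-50     | online      | 35.8 | 60.8 | 36.2 | 16.3 | 39.6 | model    |
| R-50     | semi-online | 34.5 | 59.4 | 35.0 | 16.6 | 38.3 | model    |
| Swin-L   | online      | 45.2 | 69.1 | 48.4 | 19.1 | 48.6 | model    |
| Swin-L   | semi-online | 45.4 | 69.2 | 47.8 | 18.9 | 49.0 | model    |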

License

The majority of GenVIS is licensed under an Apache-2.0 License. However, portions of the project are available under separate license terms: Detectron2 (Apache-2.0 License), IFC (Apache-2.0 License), Mask2Former (MIT License), Deformable-DETR (Apache-2.0 License), and VITA (Apache-2.0 License).

Citing GenVIS

If you use GenVIS in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@inproceedings{GenVIS,
  title={A Generalized Framework for Video Instance Segmentation},
  author={Heo, Miran and Hwang, Sukjun and Hyun, Jeongseok and Kim, Hanjung and Oh, Seoung Wug and Lee, Joon-Young and Kim, Seon Joo},
  booktitle={CVPR},
  year={2023}
}

@inproceedings{VITA,
  title={VITA: Video Instance Segmentation via Object Token Association},
  author={Heo, Miran and Hwang, Sukjun and Oh, Seoung Wug and Lee, Joon-Young and Kim, Seon Joo},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

Acknowledgement

Our code is largely based on Detectron2, IFC, Mask2Former, Deformable DETR, and VITA. We are truly grateful for their excellent work.
