CiVOS : Revisiting Click-Based Interactive Video Object Segmentation

Stephane Vujasinovic, Sebastian Bullinger, Stefan Becker, Norbert Scherer-Negenborn, Michael Arens, Rainer Stiefelhagen

ICIP 2022

📰 New Project (18/09/2023):
READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation

TL;DR: We manage the memory of STM-like sVOS methods to better deal with long videos. To attain long-term performance, we estimate the inter-frame diversity of the base memory and integrate the embeddings of an incoming frame into the memory if doing so enhances the diversity. In return, we are able to limit the number of memory slots and handle unconstrained video sequences without hindering performance on short sequences, which also alleviates the need for a sampling interval.

Paper

Abstract

While current methods for interactive Video Object Segmentation (iVOS) rely on scribble-based interactions to generate precise object masks, we propose a Click-based interactive Video Object Segmentation (CiVOS) framework to simplify the required user workload as much as possible. CiVOS builds on de-coupled modules reflecting user interaction and mask propagation. The interaction module converts click-based interactions into an object mask, which is then inferred to the remaining frames by the propagation module. Additional user interactions allow for a refinement of the object mask. The approach is extensively evaluated on the popular interactive DAVIS dataset, but with an inevitable adaptation of scribble-based interactions with click-based counterparts. We consider several strategies for generating clicks during our evaluation to reflect various user inputs and adjust the DAVIS performance metric to perform a hardware-independent comparison. The presented CiVOS pipeline achieves competitive results, although requiring a lower user workload.

[Paper] [ArXiv] [PDF]

@INPROCEEDINGS{Vujasinović_2021_ICIP,
  author={Vujasinović, Stéphane and Bullinger, Sebastian and Becker, Stefan and Scherer-Negenborn, Norbert and Arens, Michael and Stiefelhagen, Rainer},
  booktitle={2022 IEEE International Conference on Image Processing (ICIP)}, 
  title={Revisiting Click-Based Interactive Video Object Segmentation}, 
  year={2022},
  pages={2756-2760},
  doi={10.1109/ICIP46576.2022.9897460}}

Setting up the environment

The framework is built with Python 3.7 and relies on the following packages:
- NumPy 1.21.4
- SciPy 1.7.2
- PyTorch 1.10.0
- torchvision 0.11.1
- OpenCV 4.5.4 (opencv-python-headless if you don't want to use the demo)
- Cython 0.29.24
- scikit-learn 0.20.3
- scikit-image 0.18.3
- Pillow 8.4.0
- imgaug 0.4.0
- albumentations 1.10
- tqdm 4.62.3
- PyYaml 6.0
- easydict 1.9
- future 0.18.2
- cffi 1.15.0
- davis-interactive 1.0.4
- networkx 2.6.3 for DAVIS
- gdown 4.2.0 for downloading pretrained models
- tensorboard 2.4.1
Download the DAVIS dataset with download_datasets.py
Download the pretrained models with download_models.py
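
To confirm that an environment matches the pinned versions listed above, a small check along the following lines may help. This is a minimal sketch and not part of the repository: `PINS`, `installed_version` and `find_mismatches` are hypothetical names, only a few of the pins are copied over, and the comparison is a strict string match.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python >= 3.8
except ImportError:
    # Python 3.7 needs the importlib-metadata backport for the same API.
    from importlib_metadata import PackageNotFoundError, version

# A few pins copied from the list above (PyPI distribution names); extend as needed.
PINS = {
    "numpy": "1.21.4",
    "scipy": "1.7.2",
    "torch": "1.10.0",
    "torchvision": "0.11.1",
    "tqdm": "4.62.3",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not met exactly."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The script prints nothing when every listed pin is met. Passing a custom `lookup` makes the comparison testable without touching the real environment.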

Guide for Demo

Adapt the paths and variables in Demo.yml
Launch CiVOS_Demo.py (Nota bene: only 1 object can be segmented in the Demo)
Mouse and keyboard bindings:
- Positive interaction: left mouse click
- Negative interaction: right mouse click
- Predict a mask of the object of interest for the video sequence: space bar
- Visualize the results with the keys x (forward direction) and y (backward direction)
- Quit the demo with key q
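
The bindings above could be wired up roughly as follows. This is an illustrative sketch only and does not reproduce the code in CiVOS_Demo.py: `ClickRecorder`, `action_for_key` and the action names are hypothetical. The two event constants mirror the values of `cv2.EVENT_LBUTTONDOWN` and `cv2.EVENT_RBUTTONDOWN`, so the sketch runs without OpenCV installed.

```python
# Same integer values as cv2.EVENT_LBUTTONDOWN and cv2.EVENT_RBUTTONDOWN.
EVENT_LBUTTONDOWN = 1
EVENT_RBUTTONDOWN = 2

# Hypothetical action names for the keys listed above.
KEY_ACTIONS = {
    ord(" "): "propagate",       # predict the mask for the whole sequence
    ord("x"): "next_frame",      # browse results forward
    ord("y"): "previous_frame",  # browse results backward
    ord("q"): "quit",
}


class ClickRecorder:
    """Collects clicks as (x, y, is_positive) tuples."""

    def __init__(self):
        self.clicks = []

    def on_mouse(self, event, x, y, flags=0, param=None):
        """Has the (event, x, y, flags, param) signature of an OpenCV mouse callback."""
        if event == EVENT_LBUTTONDOWN:
            self.clicks.append((x, y, True))   # positive interaction
        elif event == EVENT_RBUTTONDOWN:
            self.clicks.append((x, y, False))  # negative interaction
        # All other mouse events (moves, releases, ...) are ignored.


def action_for_key(key_code):
    """Map a key code, e.g. from cv2.waitKey, to an action name or None."""
    return KEY_ACTIONS.get(key_code & 0xFF)
```

With OpenCV available, a recorder instance could be registered through `cv2.setMouseCallback(window_name, recorder.on_mouse)`, and the return value of `cv2.waitKey` passed to `action_for_key`.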

How to evaluate on DAVIS

Adapt the paths and variables of EXAMPLE_DEBUGGING.yml
Adapt and launch the bash file CiVOS_evaluation_script_example.sh
Read the resulting .csv files with Summarize_with_DAVIS_arbitrary_report.py

Results

Quantitative evaluation on the interactive DAVIS 2017 validation set.

Methods	Training interaction	Testing interaction	R-AUC-J&F	AUC-J&F	J&F@60s
MANet	Scribbles	Scribbles	0.72	0.79	0.79
ATNet	Scribbles	Scribbles	0.75	0.80	0.80
MiVOS	Scribbles	Scribbles	0.81	0.87	0.88
GIS-RAmap	Scribbles	Scribbles	0.79	0.86	0.87
MiVOS	Clicks	Clicks	0.70	0.78	0.79
CiVOS	Clicks	Clicks	0.76	0.83	0.84
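
This README does not define R-AUC-J&F. As an assumption made purely for illustration, the sketch below reads it as the area under the J&F curve taken over interaction rounds instead of elapsed time, normalized to [0, 1], which would match the hardware-independent adjustment mentioned in the abstract. The function name `normalized_auc` and the sample scores are made up; the values reported in the table come from the repository's own evaluation scripts, not from this snippet.

```python
def normalized_auc(scores):
    """Trapezoidal area under scores sampled at rounds 1..N, divided by N - 1.

    `scores` holds one J&F value per interaction round, each in [0, 1].
    Dividing by the number of intervals keeps the result in [0, 1].
    """
    if len(scores) < 2:
        raise ValueError("need scores for at least two rounds")
    # Each pair of consecutive rounds contributes one trapezoid of width 1.
    area = sum((left + right) / 2.0 for left, right in zip(scores, scores[1:]))
    return area / (len(scores) - 1)


if __name__ == "__main__":
    # Made-up per-round J&F scores for a single sequence, for illustration only.
    example_scores = [0.60, 0.70, 0.80]
    print(round(normalized_auc(example_scores), 3))
```

Because the horizontal axis counts rounds instead of seconds, a curve computed this way does not depend on how fast the evaluation machine is.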

R-AUC-J&F results on the DAVIS 2017 validation set for CiVOS by generating clicks in three different ways.

Maximal Number of Clicks	1	2	3	4	5	6	7
Interaction Strategy 1	0.69	-	-	-	-	-	-
Interaction Strategy 2	0.72	0.76	0.76	0.75	0.75	0.75	0.76
Interaction Strategy 3	0.74	0.77	0.78	0.78	0.78	0.78	0.78

Credits

RiTM: GitHub, Paper

MiVOS: GitHub, Paper

DeepLabV3Plus: GitHub, Paper

DAVIS-interactive: GitHub, Project
