Exploiting Spatial-Temporal Semantic Consistency for Video Scene Parsing

He, Xingjian; Wang, Weining; Xu, Zhiyong; Wang, Hao; Jiang, Jie; Liu, Jing

arXiv:2109.02281v1 (cs)

[Submitted on 6 Sep 2021]

Abstract: Compared with image scene parsing, video scene parsing introduces temporal information, which can effectively improve the consistency and accuracy of predictions. In this paper, we propose a Spatial-Temporal Semantic Consistency method to capture class-exclusive context information. Specifically, we design a spatial-temporal consistency loss to constrain semantic consistency in the spatial and temporal dimensions. In addition, we adopt a pseudo-labeling strategy to enrich the training dataset. We obtain 59.84% and 58.85% mIoU on the development (test part 1) and testing sets of VSPW, respectively. Our method won 1st place in the VSPW challenge at ICCV2021.
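
The reported scores are mean intersection-over-union (mIoU) values. As a reminder of how that metric is commonly computed, here is a minimal sketch in plain Python. It is not the challenge's official evaluation code: the function name `mean_iou`, the flattened per-pixel label lists, and the `ignore_index=255` convention are all assumptions made for illustration.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean IoU over the classes that occur in the prediction or the target.

    `pred` and `target` are flattened per-pixel class labels (assumed layout).
    Pixels whose target equals `ignore_index` are skipped (assumed convention).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious: List[float] = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both pred and target are skipped
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3, mean = 7/12
print(round(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2), 4))  # 0.5833
```

Benchmark toolkits usually accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores; the same function covers that case if all frames are concatenated into one pair of label lists.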

Comments:	1st Place technical report for "The 1st Video Scene Parsing in the Wild Challenge" at ICCV2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2109.02281 [cs.CV]
	(or arXiv:2109.02281v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2109.02281

Submission history

From: Xingjian He
[v1] Mon, 6 Sep 2021 08:24:38 UTC (468 KB)
