Segmenting Transparent Object in the Wild with Transformer

Xie, Enze; Wang, Wenjia; Wang, Wenhai; Sun, Peize; Xu, Hang; Liang, Ding; Luo, Ping

arXiv:2101.08461 (cs)

[Submitted on 21 Jan 2021 (v1), last revised 23 Feb 2021 (this version, v3)]

Abstract: This work presents Trans10K-v2, a new fine-grained transparent object segmentation dataset that extends Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1, which has only two categories, the new dataset offers two benefits. (1) It has 11 fine-grained categories of transparent objects that commonly occur in domestic environments, making it more practical for real-world applications. (2) It is more challenging for current advanced segmentation methods than its predecessor. Furthermore, a novel transformer-based segmentation pipeline termed Trans2Seg is proposed. First, the transformer encoder of Trans2Seg provides a global receptive field, in contrast to a CNN's local receptive field, which gives it a clear advantage over pure CNN architectures. Second, by formulating semantic segmentation as a dictionary look-up problem, we design a set of learnable prototypes as the queries of Trans2Seg's transformer decoder, where each prototype learns the statistics of one category across the whole dataset. We benchmark more than 20 recent semantic segmentation methods and show that Trans2Seg significantly outperforms all the CNN-based methods, demonstrating the proposed algorithm's potential for transparent object segmentation.
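
The dictionary look-up idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name prototype_lookup, the hand-written prototype vectors, and the pixel features are all hypothetical, and the prototypes here are fixed rather than learned. The sketch assumes one prototype vector per category acting as a query, scores each pixel feature against every prototype with a scaled dot product, and labels the pixel with the best-matching category. Normalizing the scores across categories per pixel is a simplification chosen for readability.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_lookup(prototypes, pixel_features):
    """Assign each pixel to the category whose prototype it matches best.

    prototypes: list of C vectors of length d, one query per category
        (hand-written here; in a trained model they would be learned).
    pixel_features: list of N vectors of length d, standing in for the
        flattened encoder output.
    Returns (labels, probs), where probs[i] is a distribution over the
    C categories for pixel i.
    """
    d = len(prototypes[0])
    scale = 1.0 / math.sqrt(d)  # scaled dot-product, as in attention
    labels, probs = [], []
    for feat in pixel_features:
        scores = [dot(proto, feat) * scale for proto in prototypes]
        p = softmax(scores)
        probs.append(p)
        labels.append(max(range(len(p)), key=p.__getitem__))
    return labels, probs


# Toy data: 3 hypothetical categories, 4-dimensional features.
prototypes = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
pixel_features = [
    [0.9, 0.1, 0.0, 0.2],
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.2, 0.7, 0.1],
    [0.6, 0.5, 0.1, 0.0],
]

labels, probs = prototype_lookup(prototypes, pixel_features)
print(labels)  # [0, 1, 2, 0]
```

Because every pixel is compared against the same small set of category vectors, the prototypes play the role of dictionary keys shared across the whole dataset, which is the intuition the abstract describes.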

Comments:	Tech. Report
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2101.08461 [cs.CV]
	(or arXiv:2101.08461v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2101.08461

Submission history

From: Enze Xie
[v1] Thu, 21 Jan 2021 06:41:00 UTC (5,003 KB)
[v2] Sat, 23 Jan 2021 05:14:35 UTC (5,003 KB)
[v3] Tue, 23 Feb 2021 13:23:16 UTC (4,836 KB)
