arXiv version: https://arxiv.org/abs/2010.05537
If you find our work helpful, please cite:
```
@article{liu2021learning,
  title={Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection},
  author={Liu, Nian and Zhang, Ni and Shao, Ling and Han, Junwei},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021}
}
```
We construct a new large-scale and challenging dataset, ReDWeb-S, which contains 3,179 image pairs covering various real-world scenes, together with high-quality depth maps. We split it into a training set of 2,179 RGB-D image pairs and a testing set of the remaining 1,000 pairs.
The dataset can be downloaded here: [Baidu Pan (fetch code: rp8b) | Google Drive]
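Once downloaded, each sample consists of an RGB image, a depth map, and a ground-truth saliency mask. The sketch below shows one way to load a pair; the folder names (`RGB`, `depth`, `GT`), file extensions, and sample naming are assumptions and may differ from the released archive.

```python
# Hedged loading sketch: the folder layout, extensions, and sample name below
# are assumptions about the released archive, not a documented API.
import os
import numpy as np
from PIL import Image

root = "ReDWeb-S/trainset"  # or "ReDWeb-S/testset"
name = "0001"               # hypothetical sample name

rgb   = np.array(Image.open(os.path.join(root, "RGB",   name + ".jpg")))
depth = np.array(Image.open(os.path.join(root, "depth", name + ".png")))
gt    = np.array(Image.open(os.path.join(root, "GT",    name + ".png")))

print(rgb.shape, depth.shape, gt.shape)  # e.g. (H, W, 3), (H, W), (H, W)
```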
We analyze the proposed ReDWeb-S dataset from several statistical aspects and also compare it with other existing RGB-D SOD datasets (a minimal sketch of two of these statistics is given after the figure captions below).
Fig.1. Top 60% scene and object category distributions of our proposed ReDWeb-S dataset.
Fig.2. Comparison of nine RGB-D SOD datasets in terms of the distributions of global contrast and interior contrast.
Fig.3. Comparison of the average annotation maps for nine RGB-D SOD benchmark datasets.
Fig.4. Comparison of the distribution of object size for nine RGB-D SOD benchmark datasets.
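The two simplest statistics above can be reproduced directly from the ground-truth masks. The sketch below is a minimal example, assuming binary GT masks stored as PNG files under a hypothetical `ReDWeb-S/trainset/GT` folder and a common resizing resolution of 256×256; the exact protocol used for Fig. 3 and Fig. 4 may differ.

```python
# Minimal sketch of the statistics behind Fig. 3 (average annotation map) and
# Fig. 4 (object size distribution). The GT folder path, file extension, and
# resizing resolution are assumptions, not the exact protocol from the paper.
import glob
import numpy as np
from PIL import Image

SIZE = (256, 256)  # assumed common resolution for averaging

masks, object_sizes = [], []
for path in glob.glob("ReDWeb-S/trainset/GT/*.png"):  # hypothetical layout
    m = np.array(Image.open(path).convert("L").resize(SIZE), dtype=np.float64) / 255.0
    masks.append(m)
    object_sizes.append((m > 0.5).mean())  # fraction of pixels labeled salient

avg_annotation_map = np.mean(masks, axis=0)  # Fig. 3: per-pixel saliency frequency
size_hist, _ = np.histogram(object_sizes, bins=20, range=(0.0, 1.0))  # Fig. 4
```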
We provide the results and scores of other state-of-the-art RGB-D SOD methods on our proposed dataset. You can download all results directly [here (fetch code: ov08)].
No. | Pub. | Name | Title | Download |
---|---|---|---|---|
00 | TIP2023 | CAVER | CAVER: Cross-Modal View-Mixed Transformer for Bi-Modal Salient Object Detection | results, 2kfm |
01 | TCSVT2022 | HRTransNet | HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection | results, azjb |
02 | TCSVT2021 | SwinNet | SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection | results, zf9s |
03 | ICCV2021 | CMINet | RGB-D Saliency Detection via Cascaded Mutual Information Minimization | results, maav |
04 | ICCV2021 | VST | Visual Saliency Transformer | results, rkq9 |
05 | ICCV2021 | SPNet | Specificity-preserving RGB-D Saliency Detection | results, wwup |
06 | CVPR2021 | DCF | Calibrated RGB-D Salient Object Detection | results, 3kn9 |
07 | ECCV2020 | PGAR | Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection | results, mwtr |
08 | ECCV2020 | HDFNet | Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection | results, b98z |
09 | ECCV2020 | DANet | A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection | results, 1luj |
10 | ECCV2020 | CoNet | Accurate RGB-D Salient Object Detection via Collaborative Learning | results, bqq6 |
11 | ECCV2020 | CMWNet | Cross-Modal Weighting Network for RGB-D Salient Object Detection | results, ztv9 |
12 | ECCV2020 | cmMS | RGB-D Salient Object Detection with Cross-Modality Modulation and Selection | results, kwe5 |
13 | ECCV2020 | BBS-Net | BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network | results, ya5v |
14 | ECCV2020 | ATSA | Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection | results, k750 |
15 | CVPR2020 | S2MA | Learning Selective Self-Mutual Attention for RGB-D Saliency Detection | results, g0pgx |
16 | CVPR2020 | JL-DCF | JL-DCF: Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient Object Detection | results, xh9p |
17 | CVPR2020 | UCNet | UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders | results, 6o93 |
18 | CVPR2020 | A2dele | A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection | results, swv5 |
19 | CVPR2020 | SSF-RGBD | Select, Supplement and Focus for RGB-D Saliency Detection | results, oshl |
20 | TIP2020 | DisenFusion | RGBD Salient Object Detection via Disentangled Cross-Modal Fusion | results, h3hc |
21 | TNNLS2020 | D3Net | Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks | results, tetn |
22 | ICCV2019 | DMRA | Depth-induced multi-scale recurrent attention network for saliency detection | results, kqq4 |
23 | CVPR2019 | CPFP | Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection | results, 0v2c |
24 | TIP2019 | TANet | Three-stream attention-aware network for RGB-D salient object detection | results, hsy9 |
25 | CVPR2018 | PCF | Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection | results, qzhm |
26 | PR2019 | MMCI | Multi-modal fusion network with multiscale multi-path and cross-modal interactions for RGB-D salient object detection | results, c90m |
27 | TCyb2017 | CTMF | CNNs-based RGB-D saliency detection via cross-view transfer and multiview fusion | results, i0zb |
28 | Access2019 | AFNet | Adaptive fusion for RGB-D salient object detection | results, 54zc |
29 | TIP2017 | DF | RGBD salient object detection via deep fusion | results, d7sc |
30 | ICME2016 | SE | Salient object detection for RGB-D image via saliency evolution | results, h10s |
31 | SPL2016 | DCMC | Saliency detection for stereoscopic images based on depth confidence analysis and multiple cues fusion | results, 18po |
32 | CVPR2016 | LBE | Local background enclosure for RGB-D salient object detection | results, iiz5 |
Methods | S-measure ↑ | maxF ↑ | E-measure ↑ | MAE ↓ |
---|---|---|---|---|
S2MA | 0.711 | 0.696 | 0.781 | 0.139 |
JL-DCF | 0.734 | 0.727 | 0.805 | 0.128 |
UCNet | 0.713 | 0.710 | 0.794 | 0.130 |
A2dele | 0.641 | 0.603 | 0.672 | 0.160 |
SSF-RGBD | 0.595 | 0.558 | 0.710 | 0.189 |
DisenFusion | 0.675 | 0.658 | 0.760 | 0.160 |
D3Net | 0.689 | 0.673 | 0.768 | 0.149 |
DMRA | 0.592 | 0.579 | 0.721 | 0.188 |
CPFP | 0.685 | 0.645 | 0.744 | 0.142 |
TANet | 0.656 | 0.623 | 0.741 | 0.165 |
PCF | 0.655 | 0.627 | 0.743 | 0.166 |
MMCI | 0.660 | 0.641 | 0.754 | 0.176 |
CTMF | 0.641 | 0.607 | 0.739 | 0.204 |
AFNet | 0.546 | 0.549 | 0.693 | 0.213 |
DF | 0.595 | 0.579 | 0.683 | 0.233 |
SE | 0.435 | 0.393 | 0.587 | 0.283 |
DCMC | 0.427 | 0.348 | 0.549 | 0.313 |
LBE | 0.637 | 0.629 | 0.730 | 0.253 |
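For reference, below is a minimal sketch of how two of the four metrics (MAE and maxF) are typically computed per image and then averaged over the test set. It follows the common SOD protocol (β² = 0.3, thresholds swept over [0, 1]) rather than any specific released evaluation toolbox; S-measure and E-measure are omitted since they require the structure-similarity and enhanced-alignment terms defined in their original papers.

```python
# Hedged evaluation sketch: MAE and max F-measure only, following the common
# SOD protocol rather than any specific released evaluation toolbox.
import numpy as np

def mae(pred, gt):
    """Mean absolute error; pred in [0, 1], gt binary {0, 1}, same shape."""
    return np.abs(pred - gt).mean()

def max_f_measure(pred, gt, beta2=0.3, num_thresholds=255):
    """Max F-measure over uniformly sampled binarization thresholds (beta^2 = 0.3)."""
    eps, best = 1e-8, 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds):
        binary = pred >= t
        tp = np.logical_and(binary, gt > 0.5).sum()
        precision = tp / (binary.sum() + eps)
        recall = tp / ((gt > 0.5).sum() + eps)
        f = (1.0 + beta2) * precision * recall / (beta2 * precision + recall + eps)
        best = max(best, f)
    return best
```

Per-image scores are averaged over the 1,000 testing pairs to obtain the table entries above.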
We thank all annotators for helping us construct the proposed dataset. Our dataset is built on the ReDWeb dataset, a state-of-the-art dataset for monocular image depth estimation, and we also thank its authors for providing it.
If you have any questions, please feel free to contact me. ([email protected])