Open-World Semi-Supervised Learning

Cao, Kaidi; Brbic, Maria; Leskovec, Jure

arXiv:2102.03526 (cs)

[Submitted on 6 Feb 2021 (v1), last revised 25 Jan 2022 (this version, v3)]

Abstract: A fundamental limitation of applying semi-supervised learning in real-world settings is the assumption that unlabeled test data contains only classes previously encountered in the labeled training data. However, this assumption rarely holds for data in the wild, where instances belonging to novel classes may appear at test time. Here, we introduce a novel open-world semi-supervised learning setting that formalizes the notion that novel classes may appear in the unlabeled test data. In this setting, the goal is to solve the class distribution mismatch between labeled and unlabeled data: at test time, every input instance must either be classified into one of the existing classes, or a new unseen class must be initialized for it. To tackle this challenging problem, we propose ORCA, an end-to-end deep learning approach that introduces an uncertainty adaptive margin mechanism to circumvent the bias towards seen classes caused by learning discriminative features for seen classes faster than for novel classes. In this way, ORCA reduces the gap between the intra-class variance of seen classes and that of novel classes. Experiments on image classification datasets and a single-cell annotation dataset demonstrate that ORCA consistently outperforms alternative baselines, achieving a 25% improvement on seen classes and a 96% improvement on novel classes of the ImageNet dataset.
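
The abstract names an uncertainty adaptive margin but does not spell out its form, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that uncertainty is estimated as the average of one minus the top softmax probability over unlabeled samples, and that the margin is added to the true-class logit of labeled samples, so that high uncertainty shrinks the supervised loss and slows learning on seen classes. The function names (estimate_uncertainty, margin_cross_entropy) and the margin_scale parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_uncertainty(unlabeled_logits):
    """Assumed estimate: mean of (1 - top class probability) over unlabeled samples."""
    confidences = [max(softmax(row)) for row in unlabeled_logits]
    return sum(1.0 - c for c in confidences) / len(confidences)


def margin_cross_entropy(labeled_logits, labels, uncertainty, margin_scale=1.0):
    """Cross-entropy on labeled samples with an uncertainty-scaled margin.

    Assumption: the margin is added to the true-class logit, so a larger
    uncertainty yields a smaller loss (and a weaker supervised signal).
    """
    total = 0.0
    for row, y in zip(labeled_logits, labels):
        adjusted = list(row)
        adjusted[y] += margin_scale * uncertainty  # margin grows with uncertainty
        total -= math.log(softmax(adjusted)[y])
    return total / len(labels)


# Toy usage: uniform predictions on unlabeled data give high uncertainty.
unlabeled = [[0.0, 0.0, 0.0, 0.0, 0.0] for _ in range(4)]
labeled = [[2.0, 0.5, -1.0, 0.0, 0.0], [0.1, 0.2, 0.3, 0.0, 0.0]]
labels = [0, 2]

u = estimate_uncertainty(unlabeled)
plain_loss = margin_cross_entropy(labeled, labels, uncertainty=0.0)
margin_loss = margin_cross_entropy(labeled, labels, uncertainty=u)
print(u, plain_loss, margin_loss)
```

With zero uncertainty the function reduces to ordinary cross-entropy. As the model becomes more confident on unlabeled data, the margin decays and the supervised term regains its full strength, which is one plausible reading of how such a mechanism could keep seen classes from being learned much faster than novel ones.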

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2102.03526 [cs.LG]
	(or arXiv:2102.03526v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2102.03526

Submission history

From: Maria Brbic
[v1] Sat, 6 Feb 2021 07:11:07 UTC (1,124 KB)
[v2] Thu, 13 May 2021 07:25:25 UTC (1,115 KB)
[v3] Tue, 25 Jan 2022 23:13:10 UTC (6,727 KB)