SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption

Bahri, Dara; Jiang, Heinrich; Tay, Yi; Metzler, Donald

arXiv:2106.15147v2 (cs)

[Submitted on 29 Jun 2021 (v1), last revised 15 Mar 2022 (this version, v2)]

Abstract: Self-supervised contrastive representation learning has proved incredibly successful in the vision and natural language domains, enabling state-of-the-art performance with orders of magnitude less labeled data. However, such methods are domain-specific and little has been done to leverage this technique on real-world tabular datasets. We propose SCARF, a simple, widely applicable technique for contrastive learning, where views are formed by corrupting a random subset of features. When applied to pre-train deep neural networks on the 69 real-world, tabular classification datasets from the OpenML-CC18 benchmark, SCARF not only improves classification accuracy in the fully-supervised setting but does so also in the presence of label noise and in the semi-supervised setting where only a fraction of the available training data is labeled. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders. We conduct comprehensive ablations, detailing the importance of a range of factors.
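
The view-construction step mentioned in the abstract ("corrupting a random subset of features") can be illustrated with a minimal NumPy sketch. The abstract does not spell out how corrupted values are chosen; this sketch assumes each corrupted entry is replaced by a value drawn from the same feature's empirical marginal, that is, the same column of a randomly chosen row. The function name `corrupt_features`, the `corruption_rate` parameter, and its default value are illustrative assumptions, not the authors' code.

```python
import numpy as np


def corrupt_features(X, corruption_rate=0.6, rng=None):
    """Return (view, mask) for a 2-D array X of shape (n_samples, n_features).

    Assumption (not stated in the abstract): corrupted entries are resampled
    from the empirical marginal of their own column.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X)
    n, d = X.shape
    n_corrupt = int(corruption_rate * d)

    # Each row of `ranks` is a random permutation of 0..d-1, so exactly
    # n_corrupt positions per row satisfy ranks < n_corrupt.
    ranks = rng.random((n, d)).argsort(axis=1)
    mask = ranks < n_corrupt

    # For every cell, pick a random donor row and read the same column from it.
    donors = rng.integers(0, n, size=(n, d))
    replacement = X[donors, np.arange(d)]

    # Keep original values where mask is False, donor values where it is True.
    view = np.where(mask, replacement, X)
    return view, mask
```

In a contrastive setup, the original row and its corrupted view would typically be passed through the same encoder and treated as a positive pair, with other rows in the batch serving as negatives; that training loop is omitted here.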

Comments:	ICLR 2022 Spotlight
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2106.15147 [cs.LG]
	(or arXiv:2106.15147v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2106.15147

Submission history

From: Dara Bahri
[v1] Tue, 29 Jun 2021 08:08:33 UTC (23,857 KB)
[v2] Tue, 15 Mar 2022 22:16:20 UTC (13,164 KB)
