SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption

06/29/2021
by Dara Bahri, et al.

Self-supervised contrastive representation learning has proved incredibly successful in the vision and natural language domains, enabling state-of-the-art performance with orders of magnitude less labeled data. However, such methods are domain-specific and little has been done to leverage this technique on real-world tabular datasets. We propose SCARF, a simple, widely applicable technique for contrastive learning, where views are formed by corrupting a random subset of features. When applied to pre-train deep neural networks on the 69 real-world tabular classification datasets from the OpenML-CC18 benchmark, SCARF not only improves classification accuracy in the fully supervised setting but also in the presence of label noise and in the semi-supervised setting where only a fraction of the available training data is labeled. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders. We conduct comprehensive ablations, detailing the importance of a range of factors.
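To make the view-generation idea concrete, below is a minimal PyTorch sketch of SCARF-style pre-training, assuming the corruption scheme described in the paper: each selected feature is replaced by a value drawn from that feature's empirical marginal distribution (i.e., from a random training row), and the original and corrupted views are pulled together with a standard InfoNCE-style contrastive loss. The function names, the `corruption_rate` default, and the simplified one-directional loss are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def scarf_corrupt(x, x_train, corruption_rate=0.6):
    """Sketch of SCARF view generation: corrupt a random subset of
    features by resampling each selected feature from its empirical
    marginal distribution. `corruption_rate` (an assumed default) is
    the fraction of features corrupted per sample."""
    batch, dim = x.shape
    # mask[i, j] = True -> feature j of sample i gets corrupted
    mask = torch.rand(batch, dim, device=x.device) < corruption_rate
    # draw each replacement value from a random training row, so the
    # replacement follows that feature's marginal distribution
    rand_rows = torch.randint(0, x_train.shape[0], (batch, dim), device=x.device)
    marginals = x_train[rand_rows, torch.arange(dim, device=x.device)]
    return torch.where(mask, marginals, x)

def info_nce(z1, z2, temperature=1.0):
    """Simplified InfoNCE loss: (z1[i], z2[i]) are positives, all other
    in-batch pairs are negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature
    labels = torch.arange(z1.shape[0], device=z1.device)
    return F.cross_entropy(logits, labels)

# Usage sketch (encoder is any torch.nn.Module mapping features to
# embeddings; x is a batch of tabular rows, x_train the training matrix):
#   loss = info_nce(encoder(x), encoder(scarf_corrupt(x, x_train)))
```

After pre-training with this objective, the encoder is fine-tuned (or a classification head is trained on top of it) using whatever labels are available, which is where the reported semi-supervised and label-noise gains are measured.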


Related research

06/28/2023
Multi-network Contrastive Learning Based on Global and Local Representations
The popularity of self-supervised learning has made it possible to train...

11/17/2020
Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning
Whole slide images (WSIs) have large resolutions and usually lack locali...

09/17/2020
MoPro: Webly Supervised Learning with Momentum Prototypes
We propose a webly-supervised representation learning method that does n...

04/05/2023
Self-Supervised Siamese Autoencoders
Fully supervised models often require large amounts of labeled training ...

08/31/2021
ScatSimCLR: self-supervised contrastive learning with pretext task regularization for small-scale datasets
In this paper, we consider a problem of self-supervised learning for sma...

09/16/2021
DisUnknown: Distilling Unknown Factors for Disentanglement Learning
Disentangling data into interpretable and independent factors is critica...

05/31/2023
Morphological Classification of Radio Galaxies using Semi-Supervised Group Equivariant CNNs
Out of the estimated few trillion galaxies, only around a million have b...
