ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring

11/21/2019
by David Berthelot, et al.

We improve the recently proposed "MixMatch" semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels. Augmentation anchoring feeds multiple strongly augmented versions of an input into the model and encourages each output to be close to the prediction for a weakly augmented version of the same input. To produce strong augmentations, we propose a variant of AutoAugment which learns the augmentation policy while the model is being trained. Our new algorithm, dubbed ReMixMatch, is significantly more data-efficient than prior work, requiring between 5× and 16× less data to reach the same accuracy. For example, on CIFAR-10 with 250 labeled examples we reach 93.73% accuracy (compared to MixMatch's accuracy of 93.58% with 4,000 examples) and a median accuracy of 84.92% with just four labels per class. We make our code and data open-source at https://github.com/google-research/remixmatch.
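
The two techniques named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that). It assumes distribution alignment is done by rescaling each unlabeled prediction by the ratio of the label marginal to a running average of predictions and then renormalizing, and that augmentation anchoring uses a cross-entropy loss against the weak-augmentation prediction. The function names `align_distribution`, `cross_entropy`, and `anchoring_loss` are hypothetical, and the numbers are toy values.

```python
import math


def align_distribution(pred, running_mean_pred, label_marginal, eps=1e-8):
    """Rescale one predicted class distribution so that, on average,
    predictions move toward the label marginal, then renormalize.

    Assumption: alignment = pred * label_marginal / running_mean_pred.
    """
    scaled = [
        q * p / (m + eps)
        for q, p, m in zip(pred, label_marginal, running_mean_pred)
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


def cross_entropy(target, pred, eps=1e-8):
    """Cross-entropy H(target, pred) between two class distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def anchoring_loss(weak_pred, strong_preds):
    """Average loss pulling each strong-augmentation prediction toward
    the weak-augmentation prediction, which is treated as a fixed target.
    """
    losses = [cross_entropy(weak_pred, s) for s in strong_preds]
    return sum(losses) / len(losses)


# Toy two-class example: the model over-predicts class 0 on unlabeled data,
# while the labeled data is balanced.
label_marginal = [0.5, 0.5]
running_mean_pred = [0.8, 0.2]
aligned = align_distribution([0.8, 0.2], running_mean_pred, label_marginal)

# The aligned prediction is used as the target for two strong augmentations.
loss = anchoring_loss(aligned, [[0.6, 0.4], [0.9, 0.1]])
```

In this toy case a prediction equal to the running average is pulled back to the balanced label marginal, and the anchoring loss grows as the strong-augmentation predictions drift away from the weak-augmentation target.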

research
03/26/2020

Milking CowMask for Semi-Supervised Image Classification

Consistency regularization is a technique for semi-supervised learning t...
research
02/06/2019

Semi-Supervised Learning by Label Gradient Alignment

We present label gradient alignment, a novel algorithm for semi-supervis...
research
05/06/2019

MixMatch: A Holistic Approach to Semi-Supervised Learning

Semi-supervised learning has proven to be a powerful paradigm for levera...
research
03/14/2022

SimMatch: Semi-supervised Learning with Similarity Matching

Learning with few labeled data has been a longstanding problem in the co...
research
10/08/2021

Phone-to-audio alignment without text: A Semi-supervised Approach

The task of phone-to-audio alignment has many applications in speech res...
research
09/23/2022

BioKlustering: a web app for semi-supervised learning of maximally imbalanced genomic data

Summary: Accurate phenotype prediction from genomic sequences is a highl...
research
03/23/2020

Meta Pseudo Labels

Many training algorithms of a deep neural network can be interpreted as ...
