AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data

06/14/2022
by Amin Banitalebi-Dehkordi, et al.

Semi-supervised learning (SSL) has seen great strides when labeled data is scarce but unlabeled data is abundant. Critically, most recent work assumes that such unlabeled data is drawn from the same distribution as the labeled data. In this work, we show that state-of-the-art SSL algorithms suffer a degradation in performance in the presence of unlabeled auxiliary data that does not necessarily possess the same class distribution as the labeled set. We term this problem Auxiliary-SSL and propose AuxMix, an algorithm that leverages self-supervised learning tasks to learn generic features in order to mask auxiliary data that are not semantically similar to the labeled set. We also propose to regularize learning by maximizing the predicted entropy for dissimilar auxiliary samples. We show an improvement of 5% over existing baselines on a ResNet-50 model trained on the CIFAR10 dataset with 4k labeled samples when all unlabeled data is drawn from the Tiny-ImageNet dataset. We report competitive results on several datasets and conduct ablation studies.
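The two ideas named in the abstract, masking dissimilar auxiliary samples and maximizing predicted entropy on them, can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-similarity test against per-class prototypes, the `threshold` parameter, and all function names below are assumptions chosen for illustration, and the feature vectors stand in for whatever a self-supervised encoder would produce.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; zero-probability terms contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_mask(aux_features, labeled_prototypes, threshold):
    # Assumption: a sample counts as "similar" (mask = 1) when its best
    # cosine similarity to any labeled-class prototype reaches the threshold.
    return [
        1 if max(cosine_similarity(f, p) for p in labeled_prototypes) >= threshold else 0
        for f in aux_features
    ]

def entropy_regularizer(aux_logits, mask):
    # Negative mean entropy over masked-out (dissimilar) samples.
    # Minimizing this term maximizes the predicted entropy on those samples.
    dissimilar = [entropy(softmax(z)) for z, m in zip(aux_logits, mask) if m == 0]
    if not dissimilar:
        return 0.0
    return -sum(dissimilar) / len(dissimilar)

# Hypothetical toy data: two class prototypes, two auxiliary samples.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
aux_features = [[0.9, 0.1], [-1.0, -1.0]]
aux_logits = [[2.0, 0.0], [0.0, 0.0]]

mask = similarity_mask(aux_features, prototypes, threshold=0.8)
reg = entropy_regularizer(aux_logits, mask)
```

In this toy setup the first auxiliary sample lies close to a prototype and is kept, while the second is masked out and contributes only to the entropy term. In a real training loop this term would be added, with some weight, to the usual supervised and unsupervised SSL losses.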

research
11/08/2021

TAGLETS: A System for Automatic Semi-Supervised Learning with Auxiliary Data

Machine learning practitioners often have access to a spectrum of data: ...
research
12/18/2019

RealMix: Towards Realistic Semi-Supervised Deep Learning Algorithms

Semi-Supervised Learning (SSL) algorithms have shown great potential in ...
research
08/25/2020

Learning to Learn in a Semi-Supervised Fashion

To address semi-supervised learning from both labeled and unlabeled data...
research
01/19/2021

On The Consistency Training for Open-Set Semi-Supervised Learning

Conventional semi-supervised learning (SSL) methods, e.g., MixMatch, ach...
research
08/27/2023

Pruning the Unlabeled Data to Improve Semi-Supervised Learning

In the domain of semi-supervised learning (SSL), the conventional approa...
research
02/17/2019

Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank

For many applications the collection of labeled data is expensive labori...
research
01/13/2019

Gradient Regularized Budgeted Boosting

As machine learning transitions increasingly towards real world applicat...
