On The Consistency Training for Open-Set Semi-Supervised Learning

01/19/2021
by Huixiang Luo, et al.

Conventional semi-supervised learning (SSL) methods, e.g., MixMatch, achieve strong performance when both the labeled and unlabeled data are drawn from the same distribution. However, these methods often suffer severe performance degradation in a more realistic setting where the unlabeled dataset contains out-of-distribution (OOD) samples. Recent approaches mitigate the negative influence of OOD samples by filtering them out of the unlabeled data. Our studies show that it is not necessary to discard OOD samples during training; on the contrary, the network can benefit from them if they are properly utilized. We thoroughly study how OOD samples affect DNN training in both low- and high-dimensional spaces, considering two fundamental SSL methods: Pseudo Labeling (PL) and Data Augmentation based Consistency Training (DACT). Our conclusion is twofold: (1) unlike PL, which suffers performance degradation, DACT improves model performance; (2) the improvement is closely related to the class-wise distribution gap between the labeled and the unlabeled datasets. Motivated by this observation, we further improve model performance by bridging the gap between the labeled and the unlabeled datasets (containing OOD samples). Compared with previous algorithms that devote much attention to distinguishing between in-distribution (ID) and OOD samples, our method makes better use of OOD samples and achieves state-of-the-art results.
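To make the comparison concrete, below is a minimal sketch (not the authors' implementation) of the two unlabeled-data objectives the abstract contrasts, written in PyTorch; `model`, `weak_aug`, and `strong_aug` are hypothetical stand-ins for a classifier and two stochastic augmentation functions.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, x_unlabeled, threshold=0.95):
    # Pseudo Labeling (PL): treat confident predictions as hard labels.
    # An OOD sample that passes the threshold receives a wrong ID class,
    # which is one way PL can degrade under open-set unlabeled data.
    with torch.no_grad():
        probs = F.softmax(model(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()  # keep only confident samples
    logits = model(x_unlabeled)
    per_sample = F.cross_entropy(logits, pseudo, reduction="none")
    return (per_sample * mask).mean()

def consistency_loss(model, x_unlabeled, weak_aug, strong_aug):
    # DACT: only require that predictions agree across two augmented
    # views, so no hard ID label is ever forced onto an OOD sample.
    with torch.no_grad():
        target = F.softmax(model(weak_aug(x_unlabeled)), dim=1)
    logits = model(strong_aug(x_unlabeled))
    return F.kl_div(F.log_softmax(logits, dim=1), target,
                    reduction="batchmean")
```

Because DACT never assigns a hard class to an unlabeled sample, this sketch offers one plausible intuition for conclusion (1) above, though the paper itself establishes the result empirically.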

Related Research

06/14/2022
AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data
Semi-supervised learning (SSL) has seen great strides when labeled data ...

08/08/2019
Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning
Semi-supervised learning, i.e. jointly learning from labeled and unlabele...

06/13/2022
EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning
Recent state-of-the-art methods in semi-supervised learning (SSL) combin...

01/15/2023
On Pseudo-Labeling for Class-Mismatch Semi-Supervised Learning
When there are unlabeled Out-Of-Distribution (OOD) data from other class...

02/09/2016
Minimax Lower Bounds for Realizable Transductive Classification
Transductive learning considers a training set of m labeled samples and ...

06/30/2023
Exploration and Exploitation of Unlabeled Data for Open-Set Semi-Supervised Learning
In this paper, we address a complex but practical scenario in semi-super...

08/29/2023
Prototype Fission: Closing Set for Robust Open-set Semi-supervised Learning
Semi-supervised Learning (SSL) has been proven vulnerable to out-of-dist...
