Log In Sign Up

On The Consistency Training for Open-Set Semi-Supervised Learning

by   Huixiang Luo, et al.

Conventional semi-supervised learning (SSL) methods, e.g., MixMatch, achieve great performance when both labeled and unlabeled dataset are drawn from the same distribution. However, these methods often suffer severe performance degradation in a more realistic setting, where unlabeled dataset contains out-of-distribution (OOD) samples. Recent approaches mitigate the negative influence of OOD samples by filtering them out from the unlabeled data. Our studies show that it is not necessary to get rid of OOD samples during training. On the contrary, the network can benefit from them if OOD samples are properly utilized. We thoroughly study how OOD samples affect DNN training in both low- and high-dimensional spaces, where two fundamental SSL methods are considered: Pseudo Labeling (PL) and Data Augmentation based Consistency Training (DACT). Conclusion is twofold: (1) unlike PL that suffers performance degradation, DACT brings improvement to model performance; (2) the improvement is closely related to class-wise distribution gap between the labeled and the unlabeled dataset. Motivated by this observation, we further improve the model performance by bridging the gap between the labeled and the unlabeled datasets (containing OOD samples). Compared to previous algorithms paying much attention to distinguishing between ID and OOD samples, our method makes better use of OOD samples and achieves state-of-the-art results.


AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data

Semi-supervised learning (SSL) has seen great strides when labeled data ...

Semi-supervised learning method based on predefined evenly-distributed class centroids

Compared to supervised learning, semi-supervised learning reduces the de...

EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning

Recent state-of-the-art methods in semi-supervised learning (SSL) combin...

On the Importance of Calibration in Semi-supervised Learning

State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been...

Minimax Lower Bounds for Realizable Transductive Classification

Transductive learning considers a training set of m labeled samples and ...

Prompt-driven efficient Open-set Semi-supervised Learning

Open-set semi-supervised learning (OSSL) has attracted growing interest,...

Virtual Adversarial Ladder Networks For Semi-supervised Learning

Semi-supervised learning (SSL) partially circumvents the high cost of la...