Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning

by Youngtaek Oh, et al.

Traditional semi-supervised learning (SSL) methods fall short in real-world applications because they consider neither (1) class imbalance nor (2) class distribution mismatch between labeled and unlabeled data. This paper addresses this relatively under-explored problem, imbalanced semi-supervised learning, in which heavily biased pseudo-labels can harm model performance. Interestingly, we find that the semantic pseudo-labels from a similarity-based classifier in feature space and the traditional pseudo-labels from the linear classifier have complementary properties. Motivated by this observation, we propose a general pseudo-labeling framework to address the bias. The key idea is to class-adaptively blend the semantic pseudo-label with the linear one, depending on the current pseudo-label distribution. The increased semantic pseudo-label component thereby suppresses false positives in the majority classes, and vice versa. We term this novel pseudo-labeling framework for imbalanced SSL the Distribution-Aware Semantics-Oriented (DASO) Pseudo-label. Extensive evaluation on CIFAR10/100-LT and STL10-LT shows that DASO consistently outperforms recently proposed re-balancing methods for both labels and pseudo-labels. Moreover, we demonstrate that typical SSL algorithms can effectively benefit from unlabeled data with DASO, especially when (1) class imbalance and (2) class distribution mismatch exist, and even on the recent real-world Semi-Aves benchmark.
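The blending idea in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' implementation. The function names, the cosine-similarity-to-prototypes form of the similarity-based classifier, the temperature value, and the weighting rule (each class weighted by its share of current pseudo-labels relative to the most frequent class) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def semantic_pseudo_label(feature, prototypes, temperature=0.05):
    """Assumed similarity-based classifier: softmax over the cosine
    similarity between a feature and one prototype per class."""
    return softmax([cosine(feature, c) / temperature for c in prototypes])


def blend_pseudo_labels(linear, semantic, class_dist):
    """Blend the two pseudo-labels class-adaptively.

    class_dist is the current pseudo-label distribution over classes.
    Assumed rule: the weight of the semantic component is the frequency
    of the class predicted by the linear classifier, normalised by the
    frequency of the most common class. Majority-class predictions thus
    lean on the semantic pseudo-label, minority-class predictions on the
    linear one.
    """
    top = max(class_dist)
    weights = [m / top for m in class_dist]
    predicted = max(range(len(linear)), key=linear.__getitem__)
    w = weights[predicted]
    return [(1 - w) * p + w * q for p, q in zip(linear, semantic)]


# Hypothetical example: three classes, pseudo-labels skewed toward class 0.
class_dist = [0.8, 0.15, 0.05]
semantic = semantic_pseudo_label([0.2, 1.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
majority_case = blend_pseudo_labels([0.7, 0.2, 0.1], semantic, class_dist)
minority_case = blend_pseudo_labels([0.1, 0.2, 0.7], semantic, class_dist)
```

In the majority case the linear classifier predicts the most frequent class, so the blended label is taken entirely from the semantic pseudo-label. In the minority case the linear prediction is kept almost unchanged. Because both inputs are probability vectors and the blend is a convex combination, the result still sums to one.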



Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

While semi-supervised learning (SSL) has proven to be a promising way fo...

Learning to Adapt Classifier for Imbalanced Semi-supervised Learning

Pseudo-labeling has proven to be a promising semi-supervised learning (S...

An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning

Semi-supervised learning (SSL) has shown great promise in leveraging unl...

Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning

Semi-Supervised Learning (SSL) has shown its strong ability in utilizing...

Transfer and Share: Semi-Supervised Learning from Long-Tailed Data

Long-Tailed Semi-Supervised Learning (LTSSL) aims to learn from class-im...

On Non-Random Missing Labels in Semi-Supervised Learning

Semi-Supervised Learning (SSL) is fundamentally a missing label problem,...

Calibrating Label Distribution for Class-Imbalanced Barely-Supervised Knee Segmentation

Segmentation of 3D knee MR images is important for the assessment of ost...