InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning

03/13/2023
by   Zhuoran Yu, et al.

Recent state-of-the-art methods in imbalanced semi-supervised learning (SSL) rely on confidence-based pseudo-labeling with consistency regularization. To obtain high-quality pseudo-labels, a high confidence threshold is typically adopted. However, it has been shown that softmax-based confidence scores in deep networks can be arbitrarily high for samples far from the training data, and thus, the pseudo-labels for even high-confidence unlabeled samples may still be unreliable. In this work, we present a new perspective on pseudo-labeling for imbalanced SSL. Without relying on model confidence, we propose to measure whether an unlabeled sample is likely to be "in-distribution", i.e., close to the current training data. To decide whether an unlabeled sample is "in-distribution" or "out-of-distribution", we adopt the energy score from the out-of-distribution detection literature. As training progresses and more unlabeled samples become in-distribution and contribute to training, the combined labeled and pseudo-labeled data can better approximate the true class distribution and improve the model. Experiments demonstrate that our energy-based pseudo-labeling method, InPL, albeit conceptually simple, significantly outperforms confidence-based methods on imbalanced SSL benchmarks. For example, it produces around 3% absolute accuracy improvement on CIFAR10-LT. When combined with state-of-the-art long-tailed SSL methods, further improvements are attained. In particular, in one of the most challenging scenarios, InPL achieves a 6.9% accuracy improvement over the best competitor.
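To make the idea concrete, below is a minimal sketch of energy-based inlier selection as described in the abstract. The energy score follows the standard definition from the OOD-detection literature, E(x) = -T · log Σ_k exp(f_k(x)/T) over the model's logits; lower energy indicates a more in-distribution sample. The function names, the temperature default, and the threshold value are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Energy score E(x) = -T * logsumexp(logits / T).

    Lower energy suggests the sample is closer to the training
    distribution ("in-distribution")."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    # Numerically stable log-sum-exp: subtract the per-sample max first.
    m = z.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

def select_inliers(logits_batch, threshold):
    """Keep only samples whose energy falls below a threshold
    (the hypothetical inlier criterion), and pseudo-label them
    with the argmax class of the logits."""
    scores = energy_score(logits_batch)
    keep = scores <= threshold
    pseudo_labels = np.argmax(logits_batch, axis=-1)
    return keep, pseudo_labels
```

For example, a sharply peaked logit vector like `[10, 0, 0]` has much lower energy than a flat one like `[0.1, 0, 0.1]`, so with a suitable threshold only the former would be pseudo-labeled and contribute to training.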
