EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning

06/13/2022
by   Zhuoran Yu, et al.

Recent state-of-the-art methods in semi-supervised learning (SSL) combine consistency regularization with confidence-based pseudo-labeling. To obtain high-quality pseudo-labels, a high confidence threshold is typically adopted. However, it has been shown that softmax-based confidence scores in deep networks can be arbitrarily high for samples far from the training data, so the pseudo-labels of even high-confidence unlabeled samples may still be unreliable. In this work, we present a new perspective on pseudo-labeling: instead of relying on model confidence, we measure whether an unlabeled sample is likely to be "in-distribution", i.e., close to the current training data. To classify whether an unlabeled sample is "in-distribution" or "out-of-distribution", we adopt the energy score from the out-of-distribution detection literature. As training progresses and more unlabeled samples become in-distribution and contribute to training, the combined labeled and pseudo-labeled data can better approximate the true distribution and improve the model. Experiments demonstrate that our energy-based pseudo-labeling method, albeit conceptually simple, significantly outperforms confidence-based methods on imbalanced SSL benchmarks, and achieves competitive performance on class-balanced data. For example, it produces a 4-6% improvement on CIFAR10-LT when the imbalance ratio is higher than 50. When combined with state-of-the-art long-tailed SSL methods, further improvements are attained.
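The core mechanism described above can be sketched in a few lines: compute the free-energy score E(x) = -T · log Σ_i exp(f_i(x)/T) over the model's logits, and pseudo-label only the unlabeled samples whose energy falls below a threshold (low energy indicates in-distribution). The minimal numpy sketch below is illustrative only; the threshold value and selection rule are assumptions for demonstration, not the paper's exact training procedure.

```python
import numpy as np

def energy_score(logits, T=1.0):
    """Free-energy score E(x) = -T * log(sum_i exp(f_i(x) / T)).

    Lower energy means the sample looks more in-distribution.
    Uses the log-sum-exp trick for numerical stability.
    """
    z = logits / T
    m = z.max(axis=-1, keepdims=True)
    return -T * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

def select_pseudo_labels(logits, threshold):
    """Return argmax pseudo-labels and a mask of samples whose
    energy is below the (assumed, illustrative) threshold."""
    scores = energy_score(logits)
    mask = scores <= threshold          # treat low-energy samples as in-distribution
    labels = logits.argmax(axis=-1)     # pseudo-label from the model's prediction
    return labels, mask

# A peaked logit vector yields low energy (selected); a flat one does not.
logits = np.array([[10.0, 0.0],   # confident, in-distribution-looking
                   [0.1, 0.0]])   # near-uniform, ambiguous
labels, mask = select_pseudo_labels(logits, threshold=-5.0)
```

In a full SSL loop, only the masked samples would contribute a pseudo-label loss term; as the model improves, more samples fall below the energy threshold and join training, matching the progressive behavior described in the abstract.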


