Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training

02/17/2021 · by Kai Sheng Tai, et al.

Self-training is a standard approach to semi-supervised learning where the learner's own predictions on unlabeled data are used as supervision during training. In this paper, we reinterpret this label assignment process as an optimal transportation problem between examples and classes, wherein the cost of assigning an example to a class is mediated by the current predictions of the classifier. This formulation facilitates a practical annealing strategy for label assignment and allows for the inclusion of prior knowledge on class proportions via flexible upper bound constraints. The solutions to these assignment problems can be efficiently approximated using Sinkhorn iteration, thus enabling their use in the inner loop of standard stochastic optimization algorithms. We demonstrate the effectiveness of our algorithm on the CIFAR-10, CIFAR-100, and SVHN datasets in comparison with FixMatch, a state-of-the-art self-training algorithm. Additionally, we elucidate connections between our proposed algorithm and existing confidence thresholded self-training approaches in the context of homotopy methods in optimization. Our code is available at https://github.com/stanford-futuredata/sinkhorn-label-allocation.
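
To make the label assignment step concrete, here is a minimal sketch in Python/NumPy; it is not the code from the repository above. It solves a plain entropy-regularized optimal transport problem between examples and classes with equality constraints on the class marginals, whereas the paper's formulation is more general (upper-bound constraints on class proportions and an annealed fraction of assigned label mass). All names here (sinkhorn_assign, class_props, eps) are hypothetical.

import numpy as np

def sinkhorn_assign(log_probs, class_props, eps=0.1, n_iters=50):
    """Soft label assignment as entropy-regularized optimal transport.

    log_probs:   (n, k) classifier log-probabilities on unlabeled examples
    class_props: (k,) assumed class proportions, summing to 1
    eps:         entropy regularization; smaller eps yields harder assignments
    """
    n, k = log_probs.shape
    cost = -log_probs                        # cost of assigning example i to class j
    K = np.exp(-cost / eps)                  # Gibbs kernel
    r = np.full(n, 1.0 / n)                  # row marginals: one unit of mass per example
    c = class_props                          # column marginals: class proportions
    u, v = np.ones(n), np.ones(k)
    for _ in range(n_iters):                 # Sinkhorn iteration: alternate row/column scaling
        u = r / (K @ v + 1e-16)
        v = c / (K.T @ u + 1e-16)
    P = u[:, None] * K * v[None, :]          # transport plan over examples x classes
    return P / P.sum(axis=1, keepdims=True)  # normalize rows to per-example soft labels

In a self-training loop, the resulting rows would serve as targets for the unlabeled examples, with eps (or, in the paper's formulation, the allocated label mass) annealed over the course of training so that assignments harden from nearly uniform toward one-hot.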
