Confident Sinkhorn Allocation for Pseudo-Labeling

06/13/2022
by Vu Nguyen, et al.

Semi-supervised learning is a critical tool in reducing machine learning's dependence on labeled data. It has, however, been applied primarily to image and language data, by exploiting the inherent spatial and semantic structure therein. These methods do not apply to tabular data because such domain structures are not available. Existing pseudo-labeling (PL) methods can be effective for tabular data but are vulnerable to noisy samples and to greedy assignments made under a predefined threshold whose appropriate value is unknown. This paper addresses the problem by proposing Confident Sinkhorn Allocation (CSA), which assigns labels only to samples with high confidence scores and learns the best label allocation via optimal transport. CSA outperforms the current state of the art in this practically important area.
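
To make the idea of allocating pseudo-labels via optimal transport more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The function name `sinkhorn_allocate`, the max-probability confidence filter, the uniform row and column marginals, and the parameters `conf_threshold`, `epsilon`, and `n_iters` are all assumptions made for illustration; the abstract does not specify how confidence scores are computed or which constraints the transport problem uses.

```python
import numpy as np


def sinkhorn_allocate(probs, conf_threshold=0.8, epsilon=0.1, n_iters=200):
    """Illustrative pseudo-label allocation with entropic optimal transport.

    probs: (n_samples, n_classes) predicted class probabilities for
    unlabeled data. Returns the indices of the samples kept as confident
    and the class assigned to each of them.
    """
    probs = np.asarray(probs, dtype=float)

    # Assumption: a simple max-probability filter stands in for the
    # paper's confidence score, which the abstract does not describe.
    idx = np.flatnonzero(probs.max(axis=1) >= conf_threshold)
    if idx.size == 0:
        return idx, np.empty(0, dtype=int)

    p = probs[idx]
    n, k = p.shape

    # Cost of assigning sample i to class j: negative log-probability.
    cost = -np.log(p + 1e-12)
    kernel = np.exp(-cost / epsilon)

    # Assumption: uniform marginals, i.e. every sample carries equal mass
    # and every class receives an equal share of the total mass.
    row_marginal = np.full(n, 1.0 / n)
    col_marginal = np.full(k, 1.0 / k)

    # Standard Sinkhorn iterations: alternately rescale rows and columns
    # so the transport plan matches both marginals.
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = row_marginal / (kernel @ v)
        v = col_marginal / (kernel.T @ u)

    plan = u[:, None] * kernel * v[None, :]

    # Each confident sample takes the class that receives most of its mass.
    return idx, plan.argmax(axis=1)
```

Calling `sinkhorn_allocate(probs)` on a matrix of predicted probabilities returns two arrays: the row indices that passed the confidence filter and the class allocated to each of those rows. Rows below the threshold are left unlabeled. The uniform class marginal encodes a balanced-class assumption, which would need to be replaced with an estimate of the class proportions on imbalanced data.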


research
06/13/2022

EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning

Recent state-of-the-art methods in semi-supervised learning (SSL) combin...
research
03/13/2023

InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning

Recent state-of-the-art methods in imbalanced semi-supervised learning (...
research
02/27/2023

Revisiting Self-Training with Regularized Pseudo-Labeling for Tabular Data

Recent progress in semi- and self-supervised learning has caused a rift ...
research
01/26/2023

SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning

The critical challenge of Semi-Supervised Learning (SSL) is how to effec...
research
04/28/2021

Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples

This paper proposes a novel method of learning by predicting view assign...
research
08/13/2021

Progressive Representative Labeling for Deep Semi-Supervised Learning

Deep semi-supervised learning (SSL) has experienced significant attentio...
research
08/12/2023

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

When labeled data is insufficient, semi-supervised learning with the pse...
