Pruning the Unlabeled Data to Improve Semi-Supervised Learning

08/27/2023
by Guy Hacohen, et al.

In the domain of semi-supervised learning (SSL), the conventional approach involves training a learner with a limited amount of labeled data alongside a substantial volume of unlabeled data, both drawn from the same underlying distribution. However, for deep learning models, this standard practice may not yield optimal results. In this research, we propose an alternative perspective, suggesting that distributions that are more readily separable could offer superior benefits to the learner as compared to the original distribution. To achieve this, we present PruneSSL, a practical technique for selectively removing examples from the original unlabeled dataset to enhance its separability. We present an empirical study, showing that although PruneSSL reduces the quantity of available training data for the learner, it significantly improves the performance of various competitive SSL algorithms, thereby achieving state-of-the-art results across several image classification tasks.
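The abstract does not state which criterion PruneSSL uses to decide what to remove, so the following is only a minimal sketch of the general idea of pruning unlabeled data to make it more separable, not the authors' algorithm. It assumes that examples are already embedded as feature vectors and uses a stand-in criterion: the margin between the distances to the two nearest class centroids, computed from the labeled set. The function name `prune_unlabeled`, the `keep_fraction` parameter, and the centroid-margin score are all hypothetical choices made for illustration.

```python
import numpy as np


def prune_unlabeled(features, labeled_features, labeled_targets, keep_fraction=0.5):
    """Return sorted indices of the unlabeled points to keep.

    Illustrative stand-in criterion (an assumption, not PruneSSL itself):
    a point is "easy to separate" when it is much closer to its nearest
    class centroid than to the second nearest one. Requires >= 2 classes.
    """
    classes = np.unique(labeled_targets)
    # One centroid per class, estimated from the few labeled examples.
    centroids = np.stack(
        [labeled_features[labeled_targets == c].mean(axis=0) for c in classes]
    )
    # Pairwise Euclidean distances, shape (n_unlabeled, n_classes).
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    # Margin = second-smallest distance minus smallest distance.
    sorted_dists = np.sort(dists, axis=1)
    margin = sorted_dists[:, 1] - sorted_dists[:, 0]
    # Keep the points with the largest margins; drop the ambiguous ones.
    n_keep = max(1, int(round(keep_fraction * len(features))))
    keep_idx = np.argsort(-margin)[:n_keep]
    return np.sort(keep_idx)


# Usage on synthetic data: two overlapping 2-D Gaussian blobs.
rng = np.random.default_rng(0)
unlabeled = np.concatenate(
    [rng.normal(-1.0, 1.0, size=(100, 2)), rng.normal(1.0, 1.0, size=(100, 2))]
)
labeled_x = np.array([[-1.0, -1.0], [1.0, 1.0]])
labeled_y = np.array([0, 1])
kept = prune_unlabeled(unlabeled, labeled_x, labeled_y, keep_fraction=0.5)
pruned_unlabeled = unlabeled[kept]  # smaller, more separable unlabeled set
```

The pruned set would then be passed to an SSL algorithm in place of the full unlabeled set. The trade-off the abstract describes is visible here: `keep_fraction` directly controls how much data is given up in exchange for removing points that lie between classes.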

