Combining Self-labeling with Selective Sampling

01/11/2023
by Jędrzej Kozal, et al.

Because data is the fuel that drives machine learning models and labeled data is generally expensive to obtain, semi-supervised methods remain popular. They make it possible to build large datasets without requiring many expert labels. This work combines self-labeling techniques with active learning in a selective sampling scenario. We propose a new method that builds an ensemble classifier. For each incoming observation, the method evaluates the inconsistency among the decisions of the individual base classifiers and, based on this, decides whether to request a new label or to use self-labeling. In preliminary studies, we show that a naive application of self-labeling can harm performance by introducing bias towards selected classes, which in turn leads to a skewed class distribution. Hence, we also propose mechanisms to reduce this phenomenon. Experimental evaluation shows that the proposed method matches or outperforms current selective sampling methods.
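
The decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the disagreement measure (share of base classifiers that differ from the majority vote), the two thresholds, the labeling budget, and the names `disagreement` and `decide` are all assumptions made for illustration, and the mechanisms the paper proposes against class-distribution skew are not modeled here.

```python
from collections import Counter


def disagreement(votes):
    """Fraction of base classifiers that disagree with the majority vote.

    Hypothetical inconsistency measure; the paper may use a different one.
    """
    majority_count = Counter(votes).most_common(1)[0][1]
    return 1.0 - majority_count / len(votes)


def decide(votes, budget_left, query_threshold=0.4, self_label_threshold=0.0):
    """Choose an action for one incoming observation.

    votes: class predictions of the ensemble's base classifiers.
    budget_left: number of expert labels that may still be requested.
    Both thresholds are illustrative assumptions, not values from the paper.
    Returns (action, label), where label is set only for self-labeling.
    """
    d = disagreement(votes)
    if d >= query_threshold and budget_left > 0:
        # The ensemble is inconsistent: ask the expert for a label.
        return ("query", None)
    if d <= self_label_threshold:
        # The ensemble is consistent: use its majority vote as the label.
        return ("self_label", Counter(votes).most_common(1)[0][0])
    # Neither clearly consistent nor worth a query: leave the sample unlabeled.
    return ("skip", None)
```

With these assumed thresholds, a unanimous ensemble self-labels the observation, a strongly split ensemble triggers a label request while budget remains, and anything in between is skipped. Requiring full agreement before self-labeling is a deliberately conservative choice in this sketch, since the abstract notes that naive self-labeling can bias the model towards selected classes.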

research
07/25/2022

Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning

The availability of large labeled datasets is the key component for the ...
research
04/07/2023

ASPEST: Bridging the Gap Between Active Learning and Selective Prediction

Selective prediction aims to learn a reliable model that abstains from m...
research
05/22/2023

Label Smarter, Not Harder: CleverLabel for Faster Annotation of Ambiguous Image Classification with Higher Quality

High-quality data is crucial for the success of machine learning, but la...
research
05/17/2023

Cold PAWS: Unsupervised class discovery and the cold-start problem

In many machine learning applications, labeling datasets can be an arduo...
research
07/02/2018

Learning under selective labels in the presence of expert consistency

We explore the problem of learning under selective labels in the context...
research
05/12/2021

Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

Negative sampling schemes enable efficient training given a large number...
research
04/10/2022

On Principal Curve-Based Classifiers and Similarity-Based Selective Sampling in Time-Series

Considering the concept of time-dilation, there exist some major issues ...
