Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning

07/25/2022
by   Felix Buchert, et al.

The availability of large labeled datasets is a key factor in the success of deep learning. However, annotating large datasets is generally time-consuming and expensive. Active learning is a research area that addresses the cost of labeling by selecting the most important samples to label. Diversity-based sampling algorithms are integral components of representation-based approaches to active learning. In this paper, we introduce a new diversity-based initial dataset selection algorithm that selects the most informative set of samples for initial labeling in the active learning setting. The algorithm uses self-supervised representation learning to account for the diversity of samples. We also propose a novel active learning query strategy that applies diversity-based sampling to consistency-based embeddings. By combining consistency information with diversity in the consistency-based embedding scheme, the proposed method can select more informative samples for labeling in the semi-supervised learning setting. Comparative experiments on the CIFAR-10 and Caltech-101 datasets show that, by exploiting the diversity of unlabeled data, the proposed method achieves compelling results compared with previous active learning approaches.
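
To make the idea of diversity-based sampling concrete, here is a minimal sketch of one generic technique, greedy k-center (farthest-first) selection over an embedding space. This is not the selection algorithm proposed in the paper, which the abstract does not specify. The function name `k_center_greedy`, its parameters, and the toy 2-D points standing in for self-supervised embeddings are all assumptions made for illustration.

```python
import math
import random


def k_center_greedy(embeddings, budget, first_index=0):
    """Pick `budget` indices by farthest-first traversal (greedy k-center).

    Illustrative only: assumes the embeddings are distinct points.
    """
    if not 0 < budget <= len(embeddings):
        raise ValueError("budget must be between 1 and len(embeddings)")
    selected = [first_index]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(e, embeddings[first_index]) for e in embeddings]
    while len(selected) < budget:
        # The next pick is the point farthest from the current selection.
        next_index = max(range(len(embeddings)), key=min_dist.__getitem__)
        selected.append(next_index)
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], math.dist(e, embeddings[next_index]))
    return selected


# Toy stand-in for learned embeddings: three tight 2-D clusters of 20 points.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5))
          for cx, cy in centers for _ in range(20)]

chosen = k_center_greedy(points, budget=3)
clusters_hit = {i // 20 for i in chosen}  # which cluster each pick came from
print(chosen, clusters_hit)
```

With a budget of three, the selection covers all three clusters instead of drawing several near-duplicate points from one, which is the property that makes diversity-based selection attractive when only a few samples can be labeled.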


