BALanCe: Deep Bayesian Active Learning via Equivalence Class Annealing

12/27/2021
by Renyu Zhang, et al.

Active learning has demonstrated data efficiency in many fields. Existing active learning algorithms, especially in deep Bayesian active learning, rely heavily on the quality of the model's uncertainty estimates. However, such estimates can be heavily biased, especially when training data is limited and imbalanced. In this paper, we propose BALanCe, a deep Bayesian active learning framework that mitigates the effect of such biases. Concretely, BALanCe employs a novel acquisition function that leverages the structure captured by equivalence hypothesis classes and helps distinguish between different equivalence classes. Intuitively, each equivalence class consists of instantiations of deep models with similar predictions, and BALanCe adaptively adjusts the size of the equivalence classes as learning progresses. Besides the fully sequential setting, we further propose Batch-BALanCe, a generalization of the sequential algorithm to the batched setting, which efficiently selects batches of training examples that are jointly effective for model improvement. We show that Batch-BALanCe achieves state-of-the-art performance on several benchmark datasets for active learning, and that both algorithms can effectively handle realistic challenges that often involve multi-class and imbalanced data.
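To make the equivalence-class intuition concrete, the following is a minimal illustrative sketch and not the acquisition function defined in the paper. It assumes a handful of sampled hypotheses (for example, posterior samples of a deep model), treats two hypotheses as equivalent when their hard predictions on a shared reference set differ on at most a fraction `tau` of points, and scores a candidate by how much hypotheses from different classes disagree on it. The names `hamming_distance` and `disagreement_score`, the threshold `tau`, and the use of total variation distance as the disagreement measure are all assumptions made for illustration.

```python
from itertools import combinations


def hamming_distance(preds_a, preds_b):
    """Fraction of reference points on which two hypotheses predict different labels."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def disagreement_score(candidate_probs, reference_preds, tau):
    """Average disagreement on a candidate between hypotheses in different classes.

    candidate_probs[k][c]: probability hypothesis k assigns to class c at the candidate.
    reference_preds[k]:    hard labels hypothesis k predicts on a shared reference set.
    tau:                   hypothetical threshold; pairs whose Hamming distance is
                           at most tau are treated as equivalent and ignored.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(candidate_probs)), 2):
        n_pairs += 1
        if hamming_distance(reference_preds[i], reference_preds[j]) <= tau:
            continue  # same equivalence class: nothing to distinguish
        # Total variation distance between the two predictive distributions
        # (an illustrative stand-in, not the paper's discrimination term).
        total += 0.5 * sum(
            abs(p - q) for p, q in zip(candidate_probs[i], candidate_probs[j])
        )
    return total / n_pairs if n_pairs else 0.0


# Three hypotheses: h0 and h1 agree everywhere on the reference set, h2 differs.
reference_preds = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0]]

# Candidate A splits h0 from h1 (same class); candidate B splits {h0, h1} from h2.
cand_a = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]]
cand_b = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]

score_a = disagreement_score(cand_a, reference_preds, tau=0.25)
score_b = disagreement_score(cand_b, reference_preds, tau=0.25)
```

In this toy setup candidate B scores higher than candidate A, because B separates hypotheses that belong to different equivalence classes, while part of A's disagreement is between two hypotheses that are already considered equivalent. A larger `tau` merges more hypotheses into one class; how the class size is adjusted during learning is not specified in the abstract, so `tau` is left as a plain parameter here.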
