Active Learning under Pool Set Distribution Shift and Noisy Data

06/22/2021
by Andreas Kirsch, et al.

Active Learning is essential for more label-efficient deep learning. Bayesian Active Learning has focused on BALD, which reduces model parameter uncertainty. However, we show that BALD gets stuck on out-of-distribution or junk data that is not relevant for the task. We examine a novel *Expected Predictive Information Gain (EPIG)* to deal with distribution shifts of the pool set. EPIG reduces the uncertainty of *predictions* on an unlabelled *evaluation set* sampled from the test data distribution, which may differ from the pool set distribution. Based on this, our new EPIG-BALD acquisition function for Bayesian Neural Networks selects samples that improve performance on the test data distribution, rather than samples that reduce model uncertainty everywhere, including in out-of-distribution regions with low density under the test data distribution. Our method outperforms state-of-the-art Bayesian active learning methods on high-dimensional datasets and avoids out-of-distribution junk data in cases where current state-of-the-art methods fail.
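
To make the contrast between the two scores concrete, here is a minimal sketch in plain Python. It rests on assumptions not stated in the abstract: the model posterior is represented by a handful of sampled parameter settings, BALD is estimated as the entropy of the mean prediction minus the mean entropy, and the EPIG-style score is estimated as the mutual information between a candidate's label and an evaluation point's label, averaged over evaluation points. The function names (`bald_score`, `epig_score`) and the toy probabilities are hypothetical, and this is not the paper's EPIG-BALD acquisition function.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mean_over_samples(dists):
    """Average K class-probability lists into one marginal distribution."""
    k = len(dists)
    return [sum(d[c] for d in dists) / k for c in range(len(dists[0]))]


def bald_score(pool_probs):
    """BALD estimate for one candidate x.

    pool_probs[s][c] = p(y = c | x, theta_s) for posterior sample s.
    """
    marginal = mean_over_samples(pool_probs)
    expected_entropy = sum(entropy(p) for p in pool_probs) / len(pool_probs)
    return entropy(marginal) - expected_entropy


def epig_score(pool_probs, eval_probs):
    """EPIG-style estimate for one candidate x (assumed estimator).

    eval_probs[s][j][c] = p(y* = c | x*_j, theta_s) for evaluation point j.
    Returns the mutual information between y and y*, averaged over j.
    """
    k = len(pool_probs)
    n_eval = len(eval_probs[0])
    p_y = mean_over_samples(pool_probs)
    total = 0.0
    for j in range(n_eval):
        p_ystar = mean_over_samples([eval_probs[s][j] for s in range(k)])
        mi = 0.0
        for a in range(len(p_y)):
            for b in range(len(p_ystar)):
                # Joint predictive: average over samples of the product.
                joint = sum(
                    pool_probs[s][a] * eval_probs[s][j][b] for s in range(k)
                ) / k
                if joint > 0.0:
                    mi += joint * math.log(joint / (p_y[a] * p_ystar[b]))
        total += mi
    return total / n_eval


# Toy setup: 4 posterior samples, binary labels, one evaluation point.
# Samples 0-1 and samples 2-3 disagree about the evaluation point.
eval_probs = [[[0.9, 0.1]], [[0.9, 0.1]], [[0.1, 0.9]], [[0.1, 0.9]]]

# "Relevant" candidate: disagreement follows the same split as the eval point.
relevant = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
# "Junk" candidate: equally large disagreement, but along an unrelated split.
junk = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]

bald_relevant, bald_junk = bald_score(relevant), bald_score(junk)
epig_relevant = epig_score(relevant, eval_probs)
epig_junk = epig_score(junk, eval_probs)
```

In this toy setup both candidates receive the same BALD estimate, because the posterior samples disagree about each of them equally. The EPIG-style estimate is positive for the relevant candidate and essentially zero for the junk one, because labelling the junk point carries no information about the prediction on the evaluation point.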


research
02/11/2020

Task-Aware Variational Adversarial Active Learning

Deep learning has achieved remarkable performance in various tasks thank...
research
08/20/2018

Adversarial Sampling for Active Learning

This paper describes ASAL a new active learning strategy that uses uncer...
research
03/01/2020

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision

Active learning (AL) aims to minimize labeling efforts for data-demandin...
research
08/28/2023

Maturity-Aware Active Learning for Semantic Segmentation with Hierarchically-Adaptive Sample Assessment

Active Learning (AL) for semantic segmentation is challenging due to hea...
research
08/14/2021

Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations

Our goal is to evaluate the accuracy of a black-box classification model...
research
05/06/2021

Bayesian Active Learning by Disagreements: A Geometric Perspective

We present geometric Bayesian active learning by disagreements (GBALD), ...
research
06/22/2021

MEAL: Manifold Embedding-based Active Learning

Image segmentation is a common and challenging task in autonomous drivin...
