Neural Active Learning on Heteroskedastic Distributions

11/02/2022
by   Savya Khosla, et al.
0

Models that can actively seek out the best quality training data hold the promise of more accurate, adaptable, and efficient machine learning. State-of-the-art active learning techniques tend to prefer examples that are the most difficult to classify. While this works well on homogeneous datasets, we find that it can lead to catastrophic failures when performed on multiple distributions with different degrees of label noise or heteroskedasticity. These active learning algorithms strongly prefer to draw from the distribution with more noise, even if their examples have no informative structure (such as solid color images with random labels). To this end, we demonstrate the catastrophic failure of these active learning algorithms on heteroskedastic distributions and propose a fine-tuning-based approach to mitigate these failures. Further, we propose a new algorithm that incorporates a model difference scoring function for each data point to filter out the noisy examples and sample clean examples that maximize accuracy, outperforming the existing active learning techniques on the heteroskedastic datasets. We hope these observations and techniques are immediately helpful to practitioners and can help to challenge common assumptions in the design of active learning algorithms.

READ FULL TEXT

page 2

page 4

page 5

research
08/10/2014

Exponentiated Gradient Exploration for Active Learning

Active learning strategies respond to the costly labelling task in a sup...
research
12/05/2021

Robust Active Learning: Sample-Efficient Training of Robust Deep Learning Models

Active learning is an established technique to reduce the labeling cost ...
research
07/11/2013

Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy

We describe a framework for designing efficient active learning algorith...
research
12/17/2019

When Your Robot Breaks: Active Learning During Plant Failure

Detecting and adapting to catastrophic failures in robotic systems requi...
research
02/27/2018

Adversarial Active Learning for Deep Networks: a Margin Based Approach

We propose a new active learning strategy designed for deep neural netwo...
research
06/17/2022

Towards Efficient Active Learning of PDFA

We propose a new active learning algorithm for PDFA based on three main ...
research
11/20/2022

Finding active galactic nuclei through Fink

We present the Active Galactic Nuclei (AGN) classifier as currently impl...

Please sign up or login with your details

Forgot password? Click here to reset