Towards Robust and Reproducible Active Learning Using Neural Networks

02/21/2020
by Prateek Munjal, et al.

Active learning (AL) is a promising ML paradigm that can sift through large pools of unlabeled data and help reduce annotation cost in domains where labeling the entire dataset is prohibitively expensive. Recently proposed neural-network-based AL methods use different heuristics to accomplish this goal. In this study, we show that recent AL methods offer a gain over the random sampling baseline under a brittle combination of experimental conditions. We demonstrate that such marginal gains vanish when experimental factors are changed, leading to reproducibility issues and suggesting that AL methods lack robustness. We also observe that with a properly tuned model that employs recently proposed regularization techniques, performance improves significantly for all AL methods, including the random sampling baseline, and performance differences among the AL methods become negligible. Based on these observations, we suggest a set of experiments that are critical for assessing the true effectiveness of an AL method. To facilitate these experiments, we also present an open-source toolkit. We believe our findings and recommendations will help advance reproducible research in robust AL using neural networks.
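The kind of comparison the abstract describes, an AL heuristic against the random sampling baseline repeated over several seeds, can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper or its toolkit: the 1-D two-class toy data, the nearest-centroid "model", the distance-to-threshold uncertainty heuristic, and the names `make_pool`, `fit_threshold`, `run`, and `compare`.

```python
import random
import statistics


def make_pool(rng, n=400):
    """Toy 1-D data (assumption): class 0 ~ N(-1, 1), class 1 ~ N(+1, 1)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2 * label - 1, 1.0), label))
    return data


def fit_threshold(labeled):
    """Stand-in 'model': midpoint between the two class means."""
    xs0 = [x for x, y in labeled if y == 0]
    xs1 = [x for x, y in labeled if y == 1]
    if not xs0 or not xs1:
        return 0.0  # fallback when one class has no labeled examples yet
    return (statistics.mean(xs0) + statistics.mean(xs1)) / 2


def accuracy(threshold, data):
    """Predict class 1 when x > threshold; return the fraction correct."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)


def run(strategy, seed, rounds=5, budget=10, initial=10):
    """One AL experiment; returns test accuracy after each training round."""
    rng = random.Random(seed)
    pool = make_pool(rng)
    test = make_pool(rng, n=1000)
    labeled, unlabeled = pool[:initial], pool[initial:]
    curve = []
    for _ in range(rounds):
        t = fit_threshold(labeled)
        curve.append(accuracy(t, test))
        if strategy == "random":
            rng.shuffle(unlabeled)  # random sampling baseline
        else:
            # Hypothetical uncertainty heuristic: query points nearest
            # to the current decision threshold first.
            unlabeled.sort(key=lambda p: abs(p[0] - t))
        # "Annotate" the selected points by moving them to the labeled set.
        labeled, unlabeled = labeled + unlabeled[:budget], unlabeled[budget:]
    return curve


def compare(seeds=range(5)):
    """Mean and standard deviation of final accuracy per strategy."""
    results = {}
    for strategy in ("random", "uncertainty"):
        finals = [run(strategy, s)[-1] for s in seeds]
        results[strategy] = (statistics.mean(finals), statistics.stdev(finals))
    return results


if __name__ == "__main__":
    for name, (mean, std) in compare().items():
        print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Reporting the spread across seeds alongside the mean is the point of the sketch: a gap between strategies that is smaller than the seed-to-seed variation is not evidence that the heuristic helps.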


research
05/21/2023

On the Limitations of Simulating Active Learning

Active learning (AL) is a human-and-model-in-the-loop paradigm that iter...
research
08/06/2014

When does Active Learning Work?

Active Learning (AL) methods seek to improve classifier performance when...
research
11/25/2021

Active Learning at the ImageNet Scale

Active learning (AL) algorithms aim to identify an optimal subset of dat...
research
12/29/2021

Active Learning-Based Optimization of Scientific Experimental Design

Active learning (AL) is a machine learning algorithm that can achieve gr...
research
03/25/2022

A Comparative Survey of Deep Active Learning

Active Learning (AL) is a set of techniques for reducing labeling cost b...
research
05/23/2022

PyRelationAL: A Library for Active Learning Research and Development

In constrained real-world scenarios where it is challenging or costly to...
research
03/30/2016

Robustness of Bayesian Pool-based Active Learning Against Prior Misspecification

We study the robustness of active learning (AL) algorithms against prior...
