Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach

06/05/2021
by Rafid Mahmood, et al.

Given restrictions on the availability of data, active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label. Although selecting the most useful points for training is an optimization problem, the scale of deep learning data sets forces most selection strategies to employ efficient heuristics. Instead, we propose a new integer optimization problem for selecting a core set that minimizes the discrete Wasserstein distance from the unlabeled pool. We demonstrate that this problem can be tractably solved with a Generalized Benders Decomposition algorithm. Our strategy requires high-quality latent features which we obtain by unsupervised learning on the unlabeled pool. Numerical results on several data sets show that our optimization approach is competitive with baselines and particularly outperforms them in the low budget regime where less than one percent of the data set is labeled.
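To make the selection objective concrete, the following is a minimal sketch of choosing a core set that minimizes the discrete Wasserstein distance to the pool. It rests on several assumptions that are not in the abstract: the features are one-dimensional (so the distance has a simple closed form via cumulative distributions), both the pool and the core set carry uniform weights, and the search is an exhaustive enumeration rather than the Generalized Benders Decomposition the authors describe. The names `wasserstein_1d` and `best_core_set` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def wasserstein_1d(xs, ys):
    """W1 distance between the uniform empirical distributions on xs and ys.

    Assumes 1-D features; in one dimension W1 equals the integral of the
    absolute difference between the two empirical CDFs.
    """
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    points = sorted(set(xs_sorted) | set(ys_sorted))
    dist = 0.0
    i = j = 0  # number of samples <= current left endpoint in xs / ys
    for left, right in zip(points, points[1:]):
        while i < len(xs_sorted) and xs_sorted[i] <= left:
            i += 1
        while j < len(ys_sorted) and ys_sorted[j] <= left:
            j += 1
        cdf_gap = abs(i / len(xs_sorted) - j / len(ys_sorted))
        dist += cdf_gap * (right - left)
    return dist


def best_core_set(pool, budget):
    """Brute-force search over all subsets of size `budget` (illustrative only).

    This enumeration is exponential in the pool size; it stands in for,
    and is not, the decomposition algorithm proposed in the paper.
    """
    return min(
        combinations(pool, budget),
        key=lambda subset: wasserstein_1d(subset, pool),
    )


if __name__ == "__main__":
    # Toy pool with two well-separated clusters (hypothetical data).
    pool = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    print(best_core_set(pool, budget=2))  # -> (0.1, 5.1)
```

On this toy pool the selected pair is the center of each cluster, which is the behavior one would expect from a distribution-matching objective: with a budget of two labels, each chosen point represents one mode of the unlabeled data.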


