Online Active Learning with Dynamic Marginal Gain Thresholding

01/25/2022
by Mariel A. Werner, et al.

The blessing of ubiquitous data also comes with a curse: the communication, storage, and labeling of massive, mostly redundant datasets. In our work, we seek to solve the problem at its source, collecting only valuable data and throwing out the rest, via active learning. We propose an online algorithm that, given any stream of data, any assessment of its value, and any formulation of its selection cost, extracts the most valuable subset of the stream up to a constant factor while using minimal memory. Notably, our analysis also holds for the federated setting, in which multiple agents select online from individual data streams without coordination and with potentially very different appraisals of cost. One particularly important use case is selecting and labeling training sets from unlabeled collections of data that maximize the test-time performance of a given classifier. In prediction tasks on ImageNet and MNIST, we show that our selection method outperforms random selection by up to 5-20%.
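
The abstract does not spell out the selection rule, so the following is only a minimal sketch of the general idea of online selection by thresholding marginal gain, not the authors' algorithm. It assumes a rule in which a streamed item is kept when the value it adds to the current selection exceeds the cost it adds. The names `threshold_select`, `coverage`, and `quadratic_cost`, and the toy stream itself, are hypothetical and chosen purely for illustration. No constant-factor guarantee is claimed for this sketch.

```python
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical item type: the set of "features" an item covers.
Item = FrozenSet[str]


def coverage(selected: List[Item]) -> float:
    """Illustrative value function: number of distinct features covered."""
    covered = set()
    for item in selected:
        covered |= item
    return float(len(covered))


def quadratic_cost(selected: List[Item]) -> float:
    """Illustrative cost function: each extra item costs more than the last."""
    return 0.25 * len(selected) ** 2


def threshold_select(
    stream: Iterable[Item],
    value: Callable[[List[Item]], float],
    cost: Callable[[List[Item]], float],
) -> List[Item]:
    """Single pass over the stream; only the selected items are stored.

    Assumed rule (not taken from the paper): keep an item when its
    marginal value gain exceeds its marginal cost.
    """
    selected: List[Item] = []
    for item in stream:
        candidate = selected + [item]
        gain = value(candidate) - value(selected)
        # The bar an item must clear changes as the selection grows,
        # because the marginal cost depends on what is already held.
        threshold = cost(candidate) - cost(selected)
        if gain > threshold:
            selected = candidate
    return selected


stream = [
    frozenset({"a", "b"}),
    frozenset({"a"}),        # redundant: adds no new feature
    frozenset({"c", "d"}),
    frozenset({"b", "c"}),   # redundant given the two items already kept
    frozenset({"e"}),        # new, but its gain is below the current threshold
]
picked = threshold_select(stream, coverage, quadratic_cost)
```

In this toy run only the first and third items are kept. In the federated setting described above, each agent could run such a loop on its own stream with its own `cost` function, since the rule needs no coordination between agents.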

research
02/17/2023

A survey on online active learning

Online active learning is a paradigm in machine learning that aims to se...
research
06/24/2020

Minimum Cost Active Labeling

Labeling a data set completely is important for groundtruth generation. ...
research
11/20/2019

Active Learning for Deep Detection Neural Networks

The cost of drawing object bounding boxes (i.e. labeling) for millions o...
research
05/24/2021

Cost-Accuracy Aware Adaptive Labeling for Active Learning

Conventional active learning algorithms assume a single labeler that pro...
research
05/17/2018

Single Shot Active Learning using Pseudo Annotators

Standard myopic active learning assumes that human annotations are alway...
research
05/06/2017

PANFIS++: A Generalized Approach to Evolving Learning

The concept of evolving intelligent system (EIS) provides an effective a...
research
09/03/2015

Incremental Active Opinion Learning Over a Stream of Opinionated Documents

Applications that learn from opinionated documents, like tweets or produ...
