Minimum Cost Active Labeling

06/24/2020
by Hang Qiu, et al.

Labeling a data set completely is important for ground-truth generation. In this paper, we consider the problem of minimum-cost labeling: classifying all images in a large data set to within a target accuracy bound at minimum dollar cost. Human labeling can be prohibitively expensive, so we train a classifier to accurately label part of the data set. However, training the classifier can be expensive too, particularly with active learning. Our min-cost labeling approach uses a variant of active learning to learn a model that predicts the training set size for the classifier that minimizes overall cost, then uses active learning to train the classifier so as to maximize the number of samples it can label correctly. We validate our approach on well-known public data sets such as Fashion, CIFAR-10, and CIFAR-100. In some cases, our approach has 6x lower overall cost than human labeling, and it is always cheaper than the cheapest active learning strategy.
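The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's method: the cost formula, the function names, the per-sample prices, and the saturating curve `toy_fraction` are all assumptions made for illustration. It assumes that every training sample is human-labeled, that training costs a fixed amount per training sample, and that the classifier can label some fraction of the remaining data at the target accuracy, with the rest going back to human labelers. The paper learns a model to predict the best training set size; here a handful of candidate sizes is simply scanned.

```python
def total_cost(n_total, n_train, label_cost, train_cost_per_sample, machine_fraction):
    """Dollar cost of labeling n_total items when n_train are human-labeled for training.

    machine_fraction is the (assumed) share of the remaining items that the
    classifier can label while still meeting the target accuracy.
    """
    remaining = n_total - n_train
    machine_labeled = round(remaining * machine_fraction)
    # Humans label the training set plus whatever the classifier cannot take.
    human_labeled = n_train + (remaining - machine_labeled)
    return human_labeled * label_cost + n_train * train_cost_per_sample


def cheapest_training_size(n_total, candidates, label_cost, train_cost_per_sample, fraction_fn):
    """Scan candidate training set sizes and return (best_size, best_cost)."""
    costs = {
        b: total_cost(n_total, b, label_cost, train_cost_per_sample, fraction_fn(b))
        for b in candidates
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


def toy_fraction(n_train):
    """Hypothetical saturating curve: more training data lets the classifier
    label a larger share of the remainder, with diminishing returns."""
    return 0.9 * n_train / (n_train + 5000)


if __name__ == "__main__":
    # Made-up numbers: 60,000 images, $0.04 per human label, $0.01 training cost per sample.
    size, cost = cheapest_training_size(
        n_total=60_000,
        candidates=[0, 5_000, 10_000, 20_000],
        label_cost=0.04,
        train_cost_per_sample=0.01,
        fraction_fn=toy_fraction,
    )
    print(f"best training set size: {size}, total cost: ${cost:.2f}")
```

Under these made-up numbers, a training set that is too small leaves most of the data to human labelers, while one that is too large spends more on labels and training than the classifier saves, which is why an intermediate size comes out cheapest.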

research
05/24/2021

Cost-Accuracy Aware Adaptive Labeling for Active Learning

Conventional active learning algorithms assume a single labeler that pro...
research
12/31/2018

Cluster-Based Active Learning

In this work, we introduce Cluster-Based Active Learning, a novel framew...
research
09/08/2018

Cost-Sensitive Active Learning for Intracranial Hemorrhage Detection

Deep learning for clinical applications is subject to stringent performa...
research
01/29/2019

Active learning for binary classification with variable selection

Modern computing and communication technologies can make data collection...
research
01/25/2022

Online Active Learning with Dynamic Marginal Gain Thresholding

The blessing of ubiquitous data also comes with a curse: the communicati...
research
06/17/2019

Active Learning by Greedy Split and Label Exploration

Annotating large unlabeled datasets can be a major bottleneck for machin...
research
04/01/2018

SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

State-of-the-art techniques of artificial intelligence, in particular de...
