Stopping criterion for active learning based on deterministic generalization bounds

05/15/2020
by Hideaki Ishibashi, et al.

Active learning is a framework in which the learning machine can select the samples to be used for training. This technique is promising, particularly when the cost of data acquisition and labeling is high. In active learning, determining when learning should be stopped is a critical issue. In this study, we propose a criterion for automatically stopping active learning. The proposed stopping criterion is based on the difference in expected generalization errors and on hypothesis testing. Using PAC-Bayesian theory, we derive a novel upper bound for the difference in expected generalization errors before and after obtaining a new training datum. Unlike ordinary PAC-Bayesian bounds, the proposed bound is deterministic; hence, there is no uncontrollable trade-off between the confidence and tightness of the inequality. We combine the upper bound with a statistical test to derive a stopping criterion for active learning. We demonstrate the effectiveness of the proposed method via experiments with both artificial and real datasets.
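The overall procedure described in the abstract — acquire a label, update the model, bound the change in expected generalization error, and stop once that change is statistically negligible — can be sketched as follows. This is only a minimal illustration, not the paper's method: it uses Bayesian linear regression with uncertainty sampling, takes the KL divergence between consecutive posteriors as a stand-in for the paper's deterministic bound, and replaces the hypothesis test with a simple patience rule. The functions `kl_gaussian` and `active_learning_with_stopping` and all parameters (`eps`, `patience`, `noise`, `alpha`) are hypothetical names for this sketch.

```python
import numpy as np

def kl_gaussian(mu0, S0, mu1, S1):
    """KL(N(mu0, S0) || N(mu1, S1)) between two multivariate Gaussians."""
    d = len(mu0)
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff
                  - d + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def active_learning_with_stopping(X_pool, y_pool, noise=0.1, alpha=1.0,
                                  eps=1e-3, patience=3):
    """Uncertainty-sampling active learning with a bound-style stopping rule.

    Stops once the posterior change (a proxy for the bound on the difference
    in expected generalization errors) stays below `eps` for `patience`
    consecutive acquisitions.
    """
    d = X_pool.shape[1]
    # Bayesian linear regression posterior, initialized at the prior N(0, alpha^-1 I).
    mu, S = np.zeros(d), np.eye(d) / alpha
    selected, below = [], 0
    for _ in range(len(X_pool)):
        # Uncertainty sampling: pick the pool point with the largest predictive variance.
        var = np.einsum('ij,jk,ik->i', X_pool, S, X_pool)
        var[selected] = -np.inf          # never re-select a labeled point
        i = int(np.argmax(var))
        selected.append(i)
        x, y = X_pool[i], y_pool[i]
        # Conjugate posterior update with the newly labeled point.
        S_new = np.linalg.inv(np.linalg.inv(S) + np.outer(x, x) / noise**2)
        mu_new = S_new @ (np.linalg.inv(S) @ mu + x * y / noise**2)
        # Proxy for the bound: how much did the posterior move?
        gap = kl_gaussian(mu_new, S_new, mu, S)
        mu, S = mu_new, S_new
        # Crude stand-in for the statistical test: require several consecutive
        # rounds with a negligible change before declaring convergence.
        below = below + 1 if gap < eps else 0
        if below >= patience:
            break
    return mu, selected
```

In the paper, the patience rule above would be replaced by a proper hypothesis test on the bound values, and the KL-based proxy by the deterministic PAC-Bayesian upper bound derived there.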


research
04/05/2021

Stopping Criterion for Active Learning Based on Error Stability

Active learning is a framework for supervised learning to improve the pr...
research
02/27/2017

Diameter-Based Active Learning

To date, the tightest upper and lower-bounds for the active learning of ...
research
03/06/2018

Multi-class Active Learning: A Hybrid Informative and Representative Criterion Inspired Approach

Labeling each instance in a large dataset is extremely labor- and time- ...
research
04/23/2015

Analysis of Stopping Active Learning based on Stabilizing Predictions

Within the natural language processing (NLP) community, active learning ...
research
10/07/2021

Hitting the Target: Stopping Active Learning at the Cost-Based Optimum

Active learning allows machine learning models to be trained using fewer...
research
01/14/2021

New bounds for k-means and information k-means

In this paper, we derive a new dimension-free non-asymptotic upper bound...
research
12/10/2016

Active Learning for Speech Recognition: the Power of Gradients

In training speech recognition systems, labeling audio clips can be expe...
