A Variance Maximization Criterion for Active Learning

06/23/2017
by Yazhou Yang, et al.

Active learning aims to train a classifier quickly while using as few labels as possible. The core element of virtually any active learning strategy is the criterion that measures the usefulness of unlabeled data. We propose a novel approach, which we refer to as maximizing variance for active learning, or MVAL for short. MVAL measures the value of unlabeled instances by evaluating the rate of change of the output variables caused by changes in the next sample to be queried and its potential labelling. In a sense, this criterion measures how unstable the classifier's output is for the unlabeled data points under perturbations of the training data. MVAL maintains what we will refer to as retraining information matrices to keep track of these output scores, and exploits two kinds of variance to measure informativeness and representativeness, respectively. By fusing these variances, MVAL is able to select instances that are both informative and representative. We employ our technique in combination with both logistic regression and support vector machines, and demonstrate that MVAL achieves state-of-the-art performance in experiments on a large number of standard benchmark datasets.
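
The idea of retraining information matrices can be illustrated with a small sketch. The abstract does not give the exact definitions of the two variances or of the fusion rule, so everything specific below is an assumption made for illustration: a tiny hand-written logistic regression, binary labels, a variance over the hypothesized label, a variance over the choice of queried sample, and fusion by multiplication. The names `retraining_matrices` and `mval_scores` are hypothetical and are not taken from the paper.

```python
import math
import statistics


def predict(w, x):
    """Logistic output for one point; the last entry of w is the bias."""
    z = sum(wk * v for wk, v in zip(w, x)) + w[-1]
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit a tiny logistic regression by batch gradient descent (labels 0/1)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for k, v in enumerate(xi):
                grad[k] += err * v
            grad[-1] += err
        w = [wk - lr * gk / len(X) for wk, gk in zip(w, grad)]
    return w


def retraining_matrices(X_lab, y_lab, X_unl):
    """R[label][i][j]: output on unlabeled point j after retraining with
    unlabeled point i added under the hypothesized label."""
    R = {0: [], 1: []}
    for label in (0, 1):
        for xi in X_unl:
            w = train_logreg(X_lab + [xi], y_lab + [label])
            R[label].append([predict(w, xj) for xj in X_unl])
    return R


def mval_scores(R):
    """Assumed fusion: product of two variances per candidate i."""
    n = len(R[0])
    scores = []
    for i in range(n):
        # Assumption: sensitivity of all outputs to the label given to i.
        label_var = statistics.fmean(
            [statistics.pvariance([R[0][i][j], R[1][i][j]]) for j in range(n)]
        )
        # Assumption: instability of point i's own output as the queried
        # sample changes, averaged over both hypothesized labels.
        sample_var = statistics.fmean(
            [statistics.pvariance([R[lab][k][i] for k in range(n)]) for lab in (0, 1)]
        )
        scores.append(label_var * sample_var)
    return scores


# Toy data, invented for this example only.
X_lab = [[0.0, 0.0], [1.0, 1.0]]
y_lab = [0, 1]
X_unl = [[0.1, 0.2], [0.5, 0.5], [0.9, 0.8], [0.4, 0.6]]

R = retraining_matrices(X_lab, y_lab, X_unl)
scores = mval_scores(R)
best = max(range(len(scores)), key=scores.__getitem__)
print("query index:", best)
```

The sketch retrains once per candidate and per hypothesized label, which is affordable only for toy data; it is meant to show the bookkeeping, not to reproduce the paper's results.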


research
07/29/2021

Semi-Supervised Active Learning with Temporal Output Discrepancy

While deep learning succeeds in a wide range of tasks, it highly depends...
research
04/14/2019

Exploring Representativeness and Informativeness for Active Learning

How can we find a general way to choose the most suitable samples for tr...
research
12/20/2022

Temporal Output Discrepancy for Loss Estimation-based Active Learning

While deep learning succeeds in a wide range of tasks, it highly depends...
research
02/18/2020

Information Condensing Active Learning

We introduce Information Condensing Active Learning (ICAL), a batch mode...
research
04/06/2021

Low-Regret Active learning

We develop an online learning algorithm for identifying unlabeled data p...
research
12/06/2018

Active Learning Methods based on Statistical Leverage Scores

In many real-world machine learning applications, unlabeled data are abu...
research
05/19/2017

Data-adaptive Active Sampling for Efficient Graph-Cognizant Classification

The present work deals with active sampling of graph nodes representing ...
