Active learning for binary classification with variable selection

01/29/2019
by Zhanfeng Wang, et al.

Modern computing and communication technologies make data collection very efficient. However, our ability to analyze large data sets and to extract information from them struggles to keep pace with our capacity for data collection. Many of these huge data sets are not collected for any particular research purpose. For a classification problem, this means that the essential label information may not be readily available in the data set at hand, and an extra labeling procedure is required to obtain enough label information for constructing a classification model. When a data set is huge, labeling every subject in it is costly in both money and time. It is therefore important to decide which subjects should be labeled first in order to reduce the training cost and time efficiently. Active learning is a promising approach in this situation, because it allows us to select unlabeled subjects sequentially without knowing their labels in advance. In addition, there is usually no confirmed information about which variables are essential for constructing an efficient classification rule. Thus, how to merge a variable selection scheme with an active learning procedure is of interest. In this paper, we propose a procedure for building binary classification models when complete label information is not available at the beginning of the training stage. We study a model-based active learning procedure with sequential variable selection schemes, and discuss the results of the proposed procedure from both theoretical and numerical aspects.
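To make the general idea concrete, here is a minimal sketch of a pool-based active learning loop with variable selection refitted at every step. It is not the procedure proposed in the paper: the abstract does not state the selection criterion or the variable selection scheme, so this sketch substitutes uncertainty sampling (query the subject whose predicted probability is closest to 0.5) and an L1-penalized logistic regression fitted by proximal gradient descent. All names (`fit_l1_logistic`, `active_learn`, `make_pool`, `oracle`) and all tuning values (`lam`, `step`, `iters`, `n_init`, `budget`) are hypothetical choices for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t.
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=300):
    # Proximal gradient descent for L1-penalized logistic regression.
    # The intercept b is left unpenalized. lam, step, iters are assumed values.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            r = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += r / n
            for j in range(p):
                gw[j] += r * xi[j] / n
        b -= step * gb
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, gw)]
    return w, b


def predict_proba(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def active_learn(pool, oracle, n_init=10, budget=20, seed=0):
    # pool: list of feature vectors; oracle(i): returns the label of subject i
    # (stands in for the costly labeling step).
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    labeled = [unlabeled.pop() for _ in range(n_init)]
    labels = {i: oracle(i) for i in labeled}
    for _ in range(budget):
        w, b = fit_l1_logistic([pool[i] for i in labeled],
                               [labels[i] for i in labeled])
        # Uncertainty sampling: query the subject closest to the decision boundary.
        pick = min(unlabeled, key=lambda i: abs(predict_proba(w, b, pool[i]) - 0.5))
        unlabeled.remove(pick)
        labeled.append(pick)
        labels[pick] = oracle(pick)
    w, b = fit_l1_logistic([pool[i] for i in labeled],
                           [labels[i] for i in labeled])
    # Variables with a nonzero coefficient are the currently selected ones.
    selected = [j for j, wj in enumerate(w) if wj != 0.0]
    return w, b, selected, labeled


def make_pool(n=120, p=5, seed=1):
    # Synthetic data (assumption): only the first two variables carry signal.
    rng = random.Random(seed)
    pool = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    hidden = [1 if 2.0 * x[0] - 2.0 * x[1] + rng.gauss(0, 0.5) > 0 else 0
              for x in pool]
    return pool, hidden


if __name__ == "__main__":
    pool, hidden = make_pool()
    w, b, selected, labeled = active_learn(pool, lambda i: hidden[i])
    print("labeled subjects:", len(labeled))
    print("selected variables:", selected)
```

Refitting the penalized model after every query lets the set of selected variables change as labels accumulate, which is one simple way to interleave selection of subjects with selection of variables.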


