Online Feature Screening for Data Streams with Concept Drift

04/07/2021
by Mingyuan Wang, et al.

Screening feature selection methods are often used as a preprocessing step to reduce the number of variables before the training step. Traditional screening methods focus only on complete, high-dimensional datasets. Modern datasets not only have higher dimension and larger sample size, but also exhibit properties such as streaming input, sparsity, and concept drift. A considerable number of online feature selection methods have therefore been introduced in recent years to handle these kinds of problems, and online screening methods are one category among them. The methods proposed in this work are capable of handling all three properties mentioned above: streaming input, sparsity, and concept drift. Our study focuses on classification datasets. Our experiments show that the proposed methods generate the same feature importance as their offline versions, with faster speed and less storage consumption. Furthermore, the results show that, on data streams with the concept drift property, online screening methods with integrated model adaptation have a higher true feature detection rate than those without model adaptation. On two large real datasets that potentially have the concept drift property, online screening methods with model adaptation show advantages in saving computing time and space, reducing model complexity, or improving prediction accuracy.
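To illustrate how an online screening statistic can reproduce its offline counterpart, the following is a minimal sketch, not the authors' implementation. It assumes a two-class problem and uses a Welch-style t-score as the screening statistic; the abstract does not say which statistics the paper uses. The class name `OnlineTScoreScreener`, the function `offline_t_scores`, and the synthetic data are all hypothetical. The sketch keeps only per-class counts, sums, and sums of squares, so storage does not grow with the stream length. It does not include any model adaptation for concept drift.

```python
import numpy as np


class OnlineTScoreScreener:
    """Hypothetical streaming two-class t-score screener.

    Stores only per-class sufficient statistics (count, sum, sum of squares),
    so memory use is O(n_features) regardless of how many samples arrive.
    """

    def __init__(self, n_features):
        self.n = np.zeros(2)                  # samples seen per class
        self.s = np.zeros((2, n_features))    # per-class feature sums
        self.ss = np.zeros((2, n_features))   # per-class sums of squares

    def partial_fit(self, X, y):
        """Update the running statistics with one mini-batch."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        for c in (0, 1):
            Xc = X[y == c]
            self.n[c] += Xc.shape[0]
            self.s[c] += Xc.sum(axis=0)
            self.ss[c] += (Xc ** 2).sum(axis=0)
        return self

    def scores(self):
        """Absolute t-score per feature (needs >= 2 samples in each class)."""
        n = self.n[:, None]
        mean = self.s / n
        # Unbiased variance from sums; simple, but can lose precision
        # when the mean is very large relative to the spread.
        var = (self.ss - n * mean ** 2) / (n - 1)
        se = np.sqrt(var[0] / self.n[0] + var[1] / self.n[1])
        return np.abs(mean[0] - mean[1]) / se


def offline_t_scores(X, y):
    """Same statistic computed with the full dataset held in memory."""
    X0, X1 = X[y == 0], X[y == 1]
    se = np.sqrt(X0.var(axis=0, ddof=1) / len(X0)
                 + X1.var(axis=0, ddof=1) / len(X1))
    return np.abs(X0.mean(axis=0) - X1.mean(axis=0)) / se


# Synthetic stream (an assumption for illustration): 50 features,
# of which only the first 3 differ between the two classes.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
X = rng.normal(size=(2000, 50))
X[:, :3] += y[:, None] * 1.0

screener = OnlineTScoreScreener(n_features=50)
for start in range(0, 2000, 100):             # feed mini-batches of 100
    screener.partial_fit(X[start:start + 100], y[start:start + 100])

online = screener.scores()
offline = offline_t_scores(X, y)
top3 = set(np.argsort(online)[-3:].tolist())  # highest-ranked features
```

Because the running sums carry exactly the information the offline formula needs, `online` and `offline` agree up to floating-point error, and ranking by score places the three shifted features at the top in this synthetic setup.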


