Concept Drift Detection for Streaming Data

04/04/2015
by Heng Wang, et al.

Common statistical prediction models often assume that the data are stationary. In many practical applications, however, the relationship between the response and predictor variables changes over time, which degrades the predictive performance of these models. This paper presents Linear Four Rates (LFR), a framework for detecting these concept drifts and subsequently identifying the data points that belong to the new concept (for relearning the model). Unlike conventional concept drift detection approaches, LFR can be applied to both batch and stream data; is not limited by the distribution properties of the response variable (e.g., datasets with imbalanced labels); is independent of the underlying statistical model; and uses user-specified parameters that are intuitively comprehensible. The performance of LFR is compared to benchmark approaches using both simulated and commonly used public datasets that span the gamut of concept drift types. The results show that LFR significantly outperforms benchmark approaches in terms of recall, accuracy, and detection delay across datasets.
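
To illustrate the idea of monitoring a classifier's rates on a stream, here is a minimal Python sketch. It assumes the four rates are the true positive rate, true negative rate, positive predictive value, and negative predictive value of a binary classifier. The class name FourRateMonitor and the parameters decay, tolerance, and min_updates are hypothetical, and the fixed-tolerance comparison is a simplification chosen for brevity, not the statistical test used in the paper.

```python
class FourRateMonitor:
    """Simplified drift monitor over TPR, TNR, PPV and NPV (illustrative only)."""

    RATES = ("tpr", "tnr", "ppv", "npv")

    def __init__(self, decay=0.9, tolerance=0.3, min_updates=20):
        self.decay = decay              # hypothetical smoothing factor in (0, 1)
        self.tolerance = tolerance      # hypothetical allowed gap before flagging
        self.min_updates = min_updates  # updates a rate needs before it can be flagged
        self.recent = {}                # exponentially smoothed estimate per rate
        self.hits = {r: 0 for r in self.RATES}
        self.count = {r: 0 for r in self.RATES}

    def _update_rate(self, name, hit):
        # Update both the long-run counts and the smoothed recent estimate.
        self.hits[name] += hit
        self.count[name] += 1
        prev = self.recent.get(name, float(hit))
        self.recent[name] = self.decay * prev + (1 - self.decay) * hit

    def update(self, y_true, y_pred):
        """Consume one (label, prediction) pair with values in {0, 1}."""
        correct = int(y_true == y_pred)
        # The true label selects TPR or TNR; the prediction selects PPV or NPV.
        # In every case the rate's indicator equals "prediction was correct".
        self._update_rate("tpr" if y_true == 1 else "tnr", correct)
        self._update_rate("ppv" if y_pred == 1 else "npv", correct)
        return self.drifted_rates()

    def drifted_rates(self):
        """Return the rates whose recent estimate departs from the long-run mean."""
        flagged = []
        for name in self.RATES:
            n = self.count[name]
            if n < self.min_updates:
                continue
            long_run = self.hits[name] / n
            if abs(self.recent[name] - long_run) > self.tolerance:
                flagged.append(name)
        return flagged
```

Each observation updates exactly two of the four rates, so the monitor does not depend on the model that produced the predictions. Because the long-run mean keeps absorbing post-drift observations, a practical detector would reset its statistics once a drift is flagged.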

research
07/25/2017

Concept Drift Detection and Adaptation with Hierarchical Hypothesis Testing

In a streaming environment, there is often a need for statistical predic...
research
07/28/2020

Diagnosing Concept Drift with Visual Analytics

Concept drift is a phenomenon in which the distribution of a data stream...
research
05/06/2023

Detecting Concept Drift for the reliability prediction of Software Defects using Instance Interpretation

In the context of Just-In-Time Software Defect Prediction (JIT-SDP), Con...
research
05/31/2022

Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees

The statistical characteristics of instance-label pairs often change wit...
research
12/25/2012

Exponentially Weighted Moving Average Charts for Detecting Concept Drift

Classifying streaming data requires the development of methods which are...
research
05/19/2023

OPTWIN: Drift identification with optimal sub-windows

Online Learning (OL) is a field of research that is increasingly gaining...
research
02/02/2021

Drift Estimation with Graphical Models

This paper deals with the issue of concept drift in supervised machine l...
