Concept-drifting Data Streams are Time Series; The Case for Continuous Adaptation

10/04/2018
by Jesse Read, et al.

Learning from data streams is an increasingly important topic in data mining, machine learning, and artificial intelligence in general. A major focus in the data stream literature is on designing methods that can deal with concept drift, a challenge where the generating distribution changes over time. A general assumption in most of this literature is that instances are independently distributed in the stream. In this work we show that, in the context of concept drift, this assumption is contradictory: the presence of concept drift necessarily implies temporal dependence, and thus some form of time series. This has important implications for model design and deployment. We explore and highlight these implications, and show that Hoeffding-tree-based ensembles, which are very popular for learning in streams, are not naturally suited to learning within drift, and can perform in this scenario only at the significant computational cost of destructive adaptation. On the other hand, we develop and parameterize gradient-descent methods and demonstrate how they can perform continuous adaptation with no explicit drift-detection mechanism, offering major advantages in terms of accuracy and efficiency. As a consequence of our theoretical discussion and empirical observations, we outline a number of recommendations for deploying methods in concept-drifting streams.
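
The idea of continuous adaptation without an explicit drift detector can be illustrated with a minimal sketch. This is not the authors' method or experimental setup: the synthetic stream, the helper names `make_stream` and `prequential_sgd`, the constant learning rate, and the abrupt drift point are all assumptions chosen for illustration. A logistic regression model is updated by one gradient step per instance and evaluated test-then-train, so it simply keeps following the stream when the labelling rule flips.

```python
import math
import random


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the labelling rule flips sign at index `drift_at`.

    Hypothetical synthetic stream, used only to illustrate the idea.
    """
    rng = random.Random(seed)
    for t in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        sign = 1.0 if t < drift_at else -1.0  # abrupt concept drift
        y = 1 if sign * (x[0] + x[1]) > 0 else 0
        yield x, y


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def prequential_sgd(stream, lr=0.5):
    """Test-then-train logistic regression with a constant learning rate.

    No drift detector and no model reset: every instance triggers one
    gradient step, so the weights track the current concept continuously.
    """
    w, b = [0.0, 0.0], 0.0
    hits = []
    for x, y in stream:
        p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        hits.append(int((p >= 0.5) == (y == 1)))  # test first ...
        g = p - y                                  # ... then train
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g
    return hits


hits = prequential_sgd(make_stream(4000, drift_at=2000))
acc_before = sum(hits[1500:2000]) / 500      # stable, before the drift
acc_just_after = sum(hits[2000:2050]) / 50   # immediately after the drift
acc_recovered = sum(hits[3500:4000]) / 500   # long after the drift

print("before drift:    ", acc_before)
print("just after drift:", acc_just_after)
print("recovered:       ", acc_recovered)
```

A constant learning rate is the design choice that matters in this sketch: a decaying rate would converge on the first concept and adapt ever more slowly afterwards, whereas a constant rate keeps the model responsive at the cost of some noise in the weights.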

research
09/20/2020

Adversarial Concept Drift Detection under Poisoning Attacks for Robust Data Stream Mining

Continuous learning from streaming data is among the most challenging to...
research
12/30/2022

Learning from Data Streams: An Overview and Update

The literature on machine learning in the context of data streams is vas...
research
04/24/2017

Learning from Ontology Streams with Semantic Concept Drift

Data stream learning has been largely studied for extracting knowledge s...
research
03/14/2023

On the Connection between Concept Drift and Uncertainty in Industrial Artificial Intelligence

AI-based digital twins are at the leading edge of the Industry 4.0 revol...
research
10/10/2022

A Hybrid Active-Passive Approach to Imbalanced Nonstationary Data Stream Classification

In real-world applications, the process generating the data might suffer...
research
06/14/2021

Automated Machine Learning Techniques for Data Streams

Automated machine learning techniques benefited from tremendous research...
research
01/14/2021

Analysis of hidden feedback loops in continuous machine learning systems

In this concept paper, we discuss intricacies of specifying and verifyin...
