Automated Machine Learning Techniques for Data Streams

06/14/2021
by Alexandru-Ionut Imbrea, et al.

Automated machine learning techniques have benefited from tremendous research progress in recent years. These developments and the continuously growing demand for machine learning experts have led to the development of numerous AutoML tools. However, these tools assume that the entire training dataset is available upfront and that the underlying distribution does not change over time. These assumptions do not hold in a data stream mining setting, where an unbounded stream of data cannot be stored and is likely to manifest concept drift. Industry applications of machine learning on streaming data are becoming more popular due to the increasing adoption of real-time streaming patterns in IoT, microservices architectures, web analytics, and other fields. The research summarized in this paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time. For comparative purposes, batch, batch-incremental, and instance-incremental estimators are applied and compared. Moreover, a meta-learning technique for online algorithm selection based on meta-feature extraction is proposed and compared, while model replacement and continual AutoML techniques are discussed. The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain predictive accuracy over time.
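
To make the contrast between a model trained once and an instance-incremental model more concrete, the following is a minimal sketch and not the paper's experimental setup. It assumes a synthetic one-feature stream whose labeling rule flips abruptly halfway through, and it uses test-then-train (prequential) evaluation. All names here (`make_stream`, `OnlineLogReg`, `prequential`) are hypothetical and introduced only for illustration; no AutoML tool is involved.

```python
import math
import random


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the labeling rule flips at index `drift_at` (assumed abrupt drift)."""
    rng = random.Random(seed)
    for i in range(n):
        x = rng.random()
        y = int(x > 0.5) if i < drift_at else int(x <= 0.5)
        yield x, y


class OnlineLogReg:
    """One-feature logistic regression updated by SGD, one instance at a time."""

    def __init__(self, lr=0.5):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def _score(self, x):
        # The feature is centered at 0.5 so the bias stays small.
        return self.w * (x - 0.5) + self.b

    def predict(self, x):
        return int(self._score(x) > 0.0)

    def learn(self, x, y):
        p = 1.0 / (1.0 + math.exp(-self._score(x)))
        g = p - y  # gradient of the log loss with respect to the score
        self.w -= self.lr * g * (x - 0.5)
        self.b -= self.lr * g


def prequential(stream, model, stop_learning_at=None):
    """Test-then-train: predict each instance first, then learn from it.

    If `stop_learning_at` is set, the model is frozen from that index on,
    which mimics a model trained once on an initial batch.
    """
    hits = []
    for i, (x, y) in enumerate(stream):
        hits.append(int(model.predict(x) == y))
        if stop_learning_at is None or i < stop_learning_at:
            model.learn(x, y)
    return hits


def accuracy(hits):
    return sum(hits) / len(hits)


N, DRIFT_AT = 2000, 1000
incremental_hits = prequential(make_stream(N, DRIFT_AT), OnlineLogReg())
frozen_hits = prequential(make_stream(N, DRIFT_AT), OnlineLogReg(), stop_learning_at=500)

incremental_after = accuracy(incremental_hits[DRIFT_AT:])
frozen_after = accuracy(frozen_hits[DRIFT_AT:])
print(f"accuracy after drift, incremental: {incremental_after:.3f}")
print(f"accuracy after drift, frozen:      {frozen_after:.3f}")
```

In this toy setting the frozen model keeps applying the old rule after the flip, while the incremental model continues to update and can recover. Real streams usually drift less cleanly, which is where explicit drift detection or model replacement becomes relevant.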

research
03/30/2020

A Novel Incremental Clustering Technique with Concept Drift Detection

Data are being collected from various aspects of life. These data can of...
research
08/17/2020

scikit-dyn2sel – A Dynamic Selection Framework for Data Streams

Mining data streams is a challenge per se. It must be ready to deal with...
research
09/20/2020

Instance exploitation for learning temporary concepts from sparsely labeled drifting data streams

Continual learning from streaming data sources becomes more and more pop...
research
11/17/2019

Rebalancing Learning on Evolving Data Streams

Nowadays, every device connected to the Internet generates an ever-growi...
research
10/04/2018

Concept-drifting Data Streams are Time Series; The Case for Continuous Adaptation

Learning from data streams is an increasingly important topic in data mi...
research
01/09/2023

On the challenges to learn from Natural Data Streams

In real-world contexts, sometimes data are available in form of Natural ...
research
09/05/2022

Incremental Permutation Feature Importance (iPFI): Towards Online Explanations on Data Streams

Explainable Artificial Intelligence (XAI) has mainly focused on static l...
