Dynamic Model Tree for Interpretable Data Stream Learning

03/30/2022
by Johannes Haug, et al.

Data streams are ubiquitous in modern business and society. In practice, data streams may evolve over time and cannot be stored indefinitely. Effective and transparent machine learning on data streams is thus often challenging. Hoeffding Trees have emerged as a state-of-the-art approach to online predictive modelling. They are easy to train and provide meaningful convergence guarantees under a stationary process. Yet Hoeffding Trees often require heuristic and costly extensions to adjust to distributional change, which may considerably impair their interpretability. In this work, we revisit Model Trees for machine learning in evolving data streams. Model Trees are able to maintain more flexible and locally robust representations of the active data concept, making them a natural fit for data stream applications. Our novel framework, called Dynamic Model Tree, satisfies desirable consistency and minimality properties. In experiments with synthetic and real-world tabular streaming data sets, we show that the proposed framework can drastically reduce the number of splits required by existing incremental decision trees. At the same time, our framework often outperforms state-of-the-art models in terms of predictive quality, especially when concept drift is involved. Dynamic Model Trees are thus a powerful online learning framework that contributes to more lightweight and interpretable machine learning in data streams.
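
For readers unfamiliar with the convergence guarantee mentioned above, the following is a minimal sketch of the standard Hoeffding bound split test used by Hoeffding Trees in general. It is not the Dynamic Model Tree method from this paper. The function names `hoeffding_bound` and `should_split`, and the example gain values, are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within this epsilon of the
    mean observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Illustrative split rule: split once the observed advantage of the
    best candidate over the runner-up exceeds the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Hypothetical gains; the same gap becomes significant as n grows.
    for n in (100, 1000):
        print(n, should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=n))
```

The bound shrinks as more observations arrive, which is why the guarantee assumes a stationary process: under concept drift, the samples accumulated at a node no longer come from a single distribution.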

