An Eager Splitting Strategy for Online Decision Trees

10/20/2020
by Chaitanya Manapragada, et al.

We study the effectiveness of replacing the split strategy of the state-of-the-art online tree learner, Hoeffding Tree, with a rigorous but more eager splitting strategy. Our method, Hoeffding AnyTime Tree (HATT), uses the Hoeffding Test to determine whether the current best candidate split is superior to the currently installed split, leaving the decision open to later revision, whereas Hoeffding Tree tests whether the top candidate is better than the second best and then fixes that choice permanently. Our method converges to the ideal batch tree while Hoeffding Tree does not. Decision tree ensembles are widely used in practice, and in this work, we study the efficacy of HATT as a base learner for online bagging and online boosting ensembles. On UCI and synthetic streams, Hoeffding AnyTime Tree achieves higher prequential accuracy than Hoeffding Tree. As a base learner, HATT outperforms HT at the 0.05 significance level for the majority of tested ensembles on what we believe is the largest and most comprehensive set of testbenches in the online learning literature. Our results indicate that HATT is a superior alternative to Hoeffding Tree in a large number of ensemble settings.
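The contrast between the two split tests can be sketched in a few lines. The sketch below uses the standard Hoeffding bound; the function and parameter names are illustrative, not the authors' actual implementation, and `g_*` stand for observed split-heuristic values (e.g. information gain).

```python
import math

def hoeffding_bound(value_range, delta, n):
    """With probability 1 - delta, the observed mean of n samples of a
    variable with the given range lies within this bound of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def ht_should_split(g_best, g_second, value_range, delta, n):
    # Hoeffding Tree: split once the best candidate beats the
    # *second-best* candidate by more than the bound; the chosen
    # split is then never revisited.
    return g_best - g_second > hoeffding_bound(value_range, delta, n)

def hatt_should_split(g_best, g_current, value_range, delta, n):
    # HATT: install (or replace with) the best candidate whenever it
    # beats the *currently installed* split by more than the bound,
    # so earlier decisions remain open to revision.
    return g_best - g_current > hoeffding_bound(value_range, delta, n)
```

For example, with 200 observations, a heuristic range of 1.0, and `delta = 0.05`, the bound is about 0.087, so a 0.2 advantage triggers a split while a 0.05 advantage does not.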


Related research

06/07/2023
On Computing Optimal Tree Ensembles
Random forests and, more generally, (decision-)tree ensembles are widely...

04/12/2016
Confidence Decision Trees via Online and Active Learning for Streaming (BIG) Data
Decision tree classifiers are a widely used tool in data stream mining. ...

12/07/2021
Shrub Ensembles for Online Classification
Online learning algorithms have become a ubiquitous tool in the machine ...

09/10/2021
A Neural Tangent Kernel Perspective of Infinite Tree Ensembles
In practical situations, the ensemble tree model is one of the most popu...

01/24/2023
A Robust Hypothesis Test for Tree Ensemble Pruning
Gradient boosted decision trees are some of the most popular algorithms ...

11/30/2020
Using dynamical quantization to perform split attempts in online tree regressors
A central aspect of online decision tree solutions is evaluating the inc...

11/14/2016
Splitting matters: how monotone transformation of predictor variables may improve the predictions of decision tree models
It is widely believed that the prediction accuracy of decision tree mode...
