Online Local Boosting: improving performance in online decision trees

As more data are produced each day, and at a faster rate, data stream mining is growing in importance, making clear the need for algorithms able to process these data quickly. Data stream mining algorithms are meant to extract knowledge online and are specifically tailored to continuous data problems. Many of the current algorithms for data stream mining have high processing and memory costs. Often, the higher the predictive performance, the higher these costs. To increase predictive performance without greatly increasing memory and time costs, this paper introduces a novel algorithm, named Online Local Boosting (OLBoost), which can be combined with online decision tree algorithms to improve their predictive performance without modifying the structure of the induced decision trees. To do so, OLBoost applies boosting to small, separate regions of the instance space. Experimental results presented in this paper show that, by using OLBoost, online decision tree learning algorithms can significantly improve their predictive performance. Additionally, it can make smaller trees perform as well as or better than larger trees.
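The idea of boosting inside small regions of the instance space can be illustrated with a toy sketch. This is not the OLBoost algorithm from the paper, whose details the abstract does not give. It only assumes a fixed base predictor (`base_predict`, a hypothetical two-leaf stump standing in for an online decision tree), a uniform grid over a one-dimensional input as the "regions", and one additive correction per region updated by a gradient step on the squared error. The class name `LocalBooster`, its parameters, and the synthetic target are all illustrative assumptions.

```python
import random


def base_predict(x: float) -> float:
    """Hypothetical stand-in for a fixed online tree: a two-leaf stump."""
    return 0.1 if x < 0.5 else 0.6


class LocalBooster:
    """Keeps one additive correction per small region of [0, 1).

    Illustrative assumption: regions are equal-width grid cells, not the
    regions used by the paper's method.
    """

    def __init__(self, n_regions: int = 20, learning_rate: float = 0.1):
        self.n_regions = n_regions
        self.learning_rate = learning_rate
        self.corrections = [0.0] * n_regions

    def _region(self, x: float) -> int:
        # Map x to its region index, clamping to the valid range.
        return min(max(int(x * self.n_regions), 0), self.n_regions - 1)

    def predict(self, x: float) -> float:
        # The base model is left untouched; only a local term is added.
        return base_predict(x) + self.corrections[self._region(x)]

    def learn_one(self, x: float, y: float) -> None:
        # Gradient step on the squared error, touching only this region.
        residual = y - self.predict(x)
        self.corrections[self._region(x)] += self.learning_rate * residual


def prequential_errors(n_samples: int = 5000, seed: int = 0):
    """Test-then-train over a synthetic stream.

    Returns (mean absolute error of the base model alone,
             mean absolute error of the base model plus local corrections).
    """
    rng = random.Random(seed)
    booster = LocalBooster()
    base_err = boosted_err = 0.0
    for _ in range(n_samples):
        x = rng.random()
        y = x * x  # synthetic target, chosen only for illustration
        base_err += abs(y - base_predict(x))
        boosted_err += abs(y - booster.predict(x))
        booster.learn_one(x, y)
    return base_err / n_samples, boosted_err / n_samples
```

Calling `prequential_errors()` compares the stump alone against the stump plus local corrections on the same stream. The base predictor's structure never changes; only the per-region corrections are learned, which mirrors the abstract's claim at a very coarse level.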

research
05/16/2018

Strict Very Fast Decision Tree: a memory conservative algorithm for data stream mining

Dealing with memory and time constraints are current challenges when lea...
research
11/18/2017

Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees

Additive models, such as produced by gradient boosting, and full interac...
research
08/03/2018

Hoeffding Trees with nmin adaptation

Machine learning software accounts for a significant amount of energy co...
research
04/03/2020

Unpack Local Model Interpretation for GBDT

A gradient boosting decision tree (GBDT), which aggregates a collection ...
research
01/25/2019

Faster Boosting with Smaller Memory

The two state-of-the-art implementations of boosted trees: XGBoost and L...
research
02/10/2019

Hybrid Forest: A Concept Drift Aware Data Stream Mining Algorithm

Nowadays with a growing number of online controlling systems in the orga...
research
09/12/2021

Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection

Gradient Boosting Machines (GBM) are among the go-to algorithms on tabul...
