Resource-aware Elastic Swap Random Forest for Evolving Data Streams

05/14/2019
by Diego Marrón, et al.

Continual learning based on data stream mining deals with ubiquitous sources of Big Data arriving at high velocity and in real time. Adaptive Random Forest (ARF) is a popular ensemble method for continual learning due to its simplicity in combining adaptive leveraging bagging with fast random Hoeffding trees. While the default ARF size provides competitive accuracy, it is usually over-provisioned, resulting in additional classifiers that only increase CPU and memory consumption with marginal impact on overall accuracy. This paper presents Elastic Swap Random Forest (ESRF), a method for reducing the number of trees in the ARF ensemble while providing similar accuracy. ESRF extends ARF with two orthogonal components: 1) a swap component that splits learners into two sets based on their accuracy (only the classifiers with the highest accuracy are used to make predictions); and 2) an elastic component that dynamically increases or decreases the number of classifiers in the ensemble. The experimental evaluation of ESRF and comparison with the original ARF shows how the two new components reduce the number of classifiers by up to one third while providing almost the same accuracy, resulting in per-sample speed-ups close to 3x.
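The abstract describes two mechanisms: a swap component that lets only the most accurate learners vote while the rest keep training in the background, and an elastic component that resizes the ensemble. The following is a minimal sketch of those two ideas, not the authors' implementation: the class names (`ElasticSwapEnsemble`, `SimpleLearner`), the accuracy window, and the resize thresholds are all illustrative assumptions, and a toy majority-class learner stands in for the Hoeffding trees ESRF actually uses.

```python
class SimpleLearner:
    """Toy incremental classifier: predicts the majority class seen so far.
    A stand-in for the random Hoeffding trees used in the real ESRF."""
    def __init__(self):
        self.counts = {}

    def learn_one(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_one(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None


class ElasticSwapEnsemble:
    """Sketch of the two ESRF ideas: a 'foreground' set of the most accurate
    learners makes predictions, while 'candidate' learners train in the
    background; the ensemble can also grow or shrink (elastic component)."""
    def __init__(self, n_foreground=3, n_candidates=2, window=100):
        self.foreground = [SimpleLearner() for _ in range(n_foreground)]
        self.candidates = [SimpleLearner() for _ in range(n_candidates)]
        # Sliding window of 0/1 correctness per learner, keyed by object id.
        self.acc = {id(l): [] for l in self.foreground + self.candidates}
        self.window = window

    def _score(self, learner):
        hist = self.acc[id(learner)]
        return sum(hist) / len(hist) if hist else 0.0

    def predict_one(self, x):
        # Swap component: only the foreground (highest-accuracy) set votes.
        votes = {}
        for l in self.foreground:
            p = l.predict_one(x)
            if p is not None:
                votes[p] = votes.get(p, 0) + 1
        return max(votes, key=votes.get) if votes else None

    def learn_one(self, x, y):
        # Update accuracy estimates (prequential: test, then train).
        for l in self.foreground + self.candidates:
            hist = self.acc[id(l)]
            hist.append(1 if l.predict_one(x) == y else 0)
            if len(hist) > self.window:
                hist.pop(0)
            l.learn_one(x, y)
        # Swap: promote a candidate that outperforms the worst foreground learner.
        worst = min(self.foreground, key=self._score)
        best_cand = max(self.candidates, key=self._score)
        if self._score(best_cand) > self._score(worst):
            self.foreground.remove(worst)
            self.candidates.remove(best_cand)
            self.foreground.append(best_cand)
            self.candidates.append(worst)

    def resize(self, low=0.6, high=0.9, min_fg=2, max_fg=8):
        """Elastic component (simplified): grow when recent accuracy is low,
        shrink when the ensemble is comfortably accurate. Thresholds are
        illustrative, not taken from the paper."""
        avg = sum(self._score(l) for l in self.foreground) / len(self.foreground)
        if avg < low and len(self.foreground) < max_fg:
            new = SimpleLearner()
            self.acc[id(new)] = []
            self.foreground.append(new)
        elif avg > high and len(self.foreground) > min_fg:
            self.foreground.remove(min(self.foreground, key=self._score))
```

Usage follows the usual stream-learning loop: for each arriving sample, call `predict_one`, then `learn_one`, and periodically `resize`. The design point the paper makes is that the swap set keeps prediction cost bounded by the foreground size, while background candidates preserve the ability to react to concept drift.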


Related research

- 02/14/2016, Random Forest Based Approach for Concept Drift Handling
  Concept drift has potential in smart grid analysis because the socio-eco...
- 03/19/2019, Random Pairwise Shapelets Forest
  Shapelet is a discriminative subsequence of time series. An advanced sha...
- 11/27/2022, Neural Architecture for Online Ensemble Continual Learning
  Continual learning with an increasing number of classes is a challenging...
- 04/06/2020, FastForest: Increasing Random Forest Processing Speed While Maintaining Accuracy
  Random Forest remains one of Data Mining's most enduring ensemble algori...
- 09/09/2017, Less Is More: A Comprehensive Framework for the Number of Components of Ensemble Classifiers
  The number of component classifiers chosen for an ensemble has a great i...
- 08/25/2017, Accurate parameter estimation for Bayesian Network Classifiers using Hierarchical Dirichlet Processes
  This paper introduces a novel parameter estimation method for the probab...
- 10/11/2022, Dynamic Ensemble Size Adjustment for Memory Constrained Mondrian Forest
  Supervised learning algorithms generally assume the availability of enou...
