Dynamic Ensemble Size Adjustment for Memory Constrained Mondrian Forest

10/11/2022
by   Martin Khannouz, et al.
0

Supervised learning algorithms generally assume the availability of enough memory to store data models during the training and test phases. However, this assumption is unrealistic when data comes in the form of infinite data streams, or when learning algorithms are deployed on devices with reduced amounts of memory. Such memory constraints impact the model behavior and assumptions. In this paper, we show that under memory constraints, increasing the size of a tree-based ensemble classifier can worsen its performance. In particular, we experimentally show the existence of an optimal ensemble size for a memory-bounded Mondrian forest on data streams and we design an algorithm to guide the forest toward that optimal number by using an estimation of overfitting. We tested different variations for this algorithm on a variety of real and simulated datasets, and we conclude that our method can achieve up to 95 datasets, and can even outperform it for datasets with concept drifts. All our methods are implemented in the OrpailleCC open-source library and are ready to be used on embedded systems and connected objects.

READ FULL TEXT
research
05/12/2022

Mondrian Forest for Data Stream Classification Under Memory Constraints

Supervised learning algorithms generally assume the availability of enou...
research
01/21/2021

Crossbreeding in Random Forest

Ensemble learning methods are designed to benefit from multiple learning...
research
05/18/2020

Optimal survival trees ensemble

Recent studies have adopted an approach of selecting accurate and divers...
research
10/25/2022

Eigen Memory Tree

This work introduces the Eigen Memory Tree (EMT), a novel online memory ...
research
12/10/2017

A General Memory-Bounded Learning Algorithm

In an era of big data there is a growing need for memory-bounded learnin...
research
08/09/2019

Adaptive Ensemble of Classifiers with Regularization for Imbalanced Data Classification

Dynamic ensembling of classifiers is an effective approach in processing...
research
05/14/2019

Resource-aware Elastic Swap Random Forest for Evolving Data Streams

Continual learning based on data stream mining deals with ubiquitous sou...

Please sign up or login with your details

Forgot password? Click here to reset