Bayesian post-hoc regularization of random forests

06/06/2023
by Bastian Pfeifer et al.

Random forests are powerful ensemble learning algorithms widely used across machine learning tasks. However, they tend to overfit noisy or irrelevant features, which can reduce generalization performance. Post-hoc regularization techniques aim to mitigate this issue by modifying the learned ensemble after training. Here, we propose Bayesian post-hoc regularization, which leverages the reliable patterns captured by nodes closer to the root while reducing the impact of more specific, potentially noisy leaf nodes deeper in the tree. This acts as a form of soft pruning: the structure of the trees is left unchanged, and only the influence of each leaf node is adjusted according to its proximity to the root. We evaluated our method on a range of machine learning data sets. It performs competitively with state-of-the-art methods and, in certain cases, surpasses them in predictive accuracy and generalization.
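To make the idea of depth-dependent leaf influence concrete, here is a minimal sketch of a related, non-Bayesian technique, hierarchical shrinkage. It is not the Bayesian update proposed in the paper, which the abstract does not specify. The function name, the parameter `lam`, and the example numbers are all hypothetical. The sketch follows one root-to-leaf path and damps each step away from the parent mean more strongly when the parent holds few training samples, so deep, sparsely populated nodes contribute less.

```python
from typing import Sequence


def shrunk_prediction(node_means: Sequence[float],
                      node_counts: Sequence[int],
                      lam: float) -> float:
    """Shrink a leaf prediction toward its ancestors along one path.

    node_means[i]  -- mean training response of the node at depth i
                      (index 0 is the root, the last index is the leaf)
    node_counts[i] -- number of training samples in that node
    lam            -- hypothetical regularization strength; 0 disables shrinkage
    """
    if not node_means or len(node_means) != len(node_counts):
        raise ValueError("node_means and node_counts must be non-empty and equal in length")

    # Start from the root mean, the estimate backed by the most data.
    prediction = node_means[0]
    for depth in range(1, len(node_means)):
        # Change in the mean when moving from the parent to the child node.
        step = node_means[depth] - node_means[depth - 1]
        # Damp the step more when the parent node has few samples.
        prediction += step / (1.0 + lam / node_counts[depth - 1])
    return prediction


# Illustrative path: root -> internal node -> leaf (made-up values).
path_means = [0.50, 0.70, 0.90]
path_counts = [100, 20, 4]

unregularized = shrunk_prediction(path_means, path_counts, lam=0.0)
regularized = shrunk_prediction(path_means, path_counts, lam=10.0)
```

With `lam=0.0` the function returns the plain leaf mean. With a positive `lam` the prediction is pulled back toward the root mean, and the tree structure itself is never modified, which matches the "pruning without altering structure" behaviour described above.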


