Cost-complexity pruning of random forests

03/15/2017
by Kiran Bangalore Ravi, et al.

Random forests perform bootstrap aggregation by sampling the training set with replacement. This enables the evaluation of the out-of-bag error, which serves as an internal cross-validation mechanism. Our motivation lies in using the unsampled training samples to improve each decision tree in the ensemble. We study the effect of using the out-of-bag samples to improve the generalization error, first of the individual decision trees and then of the random forest, by post-pruning. A preliminary empirical study on four UCI repository datasets shows a consistent decrease in the size of the forests without a considerable loss in accuracy.
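The abstract does not include code, but the idea of pruning each bootstrapped tree using its own out-of-bag samples can be illustrated with scikit-learn's cost-complexity pruning utilities (cost_complexity_pruning_path and the ccp_alpha parameter). The sketch below is an assumption-laden illustration, not the authors' implementation: the dataset, number of trees, and the strategy of growing and pruning each tree from scratch (rather than post-pruning an already trained forest) are choices made here for brevity.

```python
# Minimal sketch (not the paper's implementation): prune each bagged tree at the
# cost-complexity alpha that maximizes accuracy on its out-of-bag samples.
import numpy as np
from sklearn.datasets import load_breast_cancer  # illustrative dataset, an assumption
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.RandomState(0)
n_trees, n_samples = 10, X.shape[0]
forest = []

for _ in range(n_trees):
    # Bootstrap sample: in-bag indices for training, out-of-bag indices for pruning.
    in_bag = rng.randint(0, n_samples, n_samples)
    oob = np.setdiff1d(np.arange(n_samples), in_bag)

    # Candidate cost-complexity pruning levels for a fully grown tree.
    base = DecisionTreeClassifier(random_state=0, max_features="sqrt")
    alphas = base.cost_complexity_pruning_path(X[in_bag], y[in_bag]).ccp_alphas

    # Select the alpha with the best out-of-bag accuracy.
    best_alpha, best_acc = 0.0, -np.inf
    for alpha in alphas:
        t = DecisionTreeClassifier(random_state=0, max_features="sqrt", ccp_alpha=alpha)
        t.fit(X[in_bag], y[in_bag])
        acc = t.score(X[oob], y[oob])
        if acc > best_acc:
            best_alpha, best_acc = alpha, acc

    # Keep the tree pruned at the selected alpha.
    pruned = DecisionTreeClassifier(random_state=0, max_features="sqrt",
                                    ccp_alpha=best_alpha).fit(X[in_bag], y[in_bag])
    forest.append(pruned)

# Majority-vote prediction over the pruned trees.
votes = np.stack([t.predict(X) for t in forest])
pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("training accuracy of pruned ensemble:", (pred == y).mean())
```

Because each tree is evaluated only on samples it never saw during training, the selected pruning level acts as a per-tree regularizer, which is how smaller forests can be obtained without a large accuracy penalty.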


