Confidence intervals for the random forest generalization error

12/11/2021
by Marques F., et al.

We show that underneath the training process of a random forest lies not only the well-known and almost computationally free out-of-bag point estimate of its generalization error, but also a path to computing a confidence interval for that error which requires neither retraining the forest nor any form of data splitting. Besides its low computational cost, this confidence interval is shown through simulations to have good coverage, and its width shrinks at an appropriate rate as the training sample size grows.
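
The out-of-bag point estimate mentioned above is readily available in common random forest implementations. Below is a minimal sketch of that estimate using scikit-learn, assuming a synthetic classification problem; the confidence interval construction proposed in the paper is not reproduced here.

```python
# Minimal sketch of the out-of-bag (OOB) point estimate of the generalization
# error, assuming scikit-learn and synthetic classification data.
# The paper's confidence interval construction is not shown here.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# oob_score=True makes each tree get evaluated on the samples left out of
# its bootstrap resample, at essentially no extra cost during training.
forest = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
forest.fit(X, y)

# oob_score_ is the OOB accuracy, so 1 - oob_score_ is the OOB point
# estimate of the generalization error.
oob_error = 1.0 - forest.oob_score_
print(f"OOB estimate of the generalization error: {oob_error:.4f}")
```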


Related research

07/01/2018
Calculation of sample size guaranteeing the required width of the empirical confidence interval with predefined probability
The goal of any estimation study is an interval estimation of the para...

03/08/2021
Forest Guided Smoothing
We use the output of a random forest to define a family of local smoothe...

02/24/2021
Generalised Boosted Forests
This paper extends recent work on boosting random forests to model non-G...

04/26/2022
Confidence Band Estimation for Survival Random Forests
Survival random forest is a popular machine learning tool for modeling c...

05/24/2019
HDI-Forest: Highest Density Interval Regression Forest
By seeking the narrowest prediction intervals (PIs) that satisfy the spe...

09/06/2021
Binomial confidence intervals for rare events: importance of defining margin of error relative to magnitude of proportion
Confidence interval performance is typically assessed in terms of two cr...

06/01/2023
Confidence Intervals for Error Rates in Matching Tasks: Critical Review and Recommendations
Matching algorithms are commonly used to predict matches between items i...
