Optimizing Prediction Intervals by Tuning Random Forest via Meta-Validation

01/22/2018
by Sean Bayley, et al.

Recent studies have shown that tuning prediction models increases prediction accuracy and that Random Forest can be used to construct prediction intervals. However, to the best of our knowledge, no study has investigated the need to, and the manner in which one can, tune Random Forest for optimizing prediction intervals; this paper aims to fill this gap. We explore a tuning approach that combines an effectively exhaustive search with a validation technique on a single Random Forest parameter. This paper investigates which of eight validation techniques are beneficial for tuning, i.e., which automatically choose a Random Forest configuration that constructs prediction intervals that are reliable and narrower than those of the default configuration. Additionally, we present and validate three meta-validation techniques to determine which are beneficial, i.e., which automatically choose a beneficial validation technique. This study uses data from our industrial partner (Keymind Inc.) and the Tukutuku Research Project, related to post-release defect prediction and Web application effort estimation, respectively. Results from our study indicate that: i) the default configuration is frequently unreliable, ii) most of the validation techniques, including ones previously adopted with success such as 50/50 holdout and bootstrap, are counterproductive in most cases, and iii) the 75/25 holdout meta-validation technique is always beneficial; i.e., it avoids the likely counterproductive effects of validation techniques.
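
The selection criterion described in the abstract (prefer a configuration whose intervals are reliable and narrower than the default's) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "reliable" means the observed coverage on validation data is at least the nominal confidence level, and it takes the prediction intervals as already computed. The function names, configuration names, and numbers are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def coverage(intervals: Sequence[Interval], actuals: Sequence[float]) -> float:
    """Fraction of actual values that fall inside their prediction interval."""
    hits = sum(1 for (lo, hi), y in zip(intervals, actuals) if lo <= y <= hi)
    return hits / len(actuals)


def mean_width(intervals: Sequence[Interval]) -> float:
    """Average interval width (upper bound minus lower bound)."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


def select_configuration(
    candidates: Dict[str, List[Interval]],
    actuals: Sequence[float],
    nominal: float,
    default: str = "default",
) -> str:
    """Return the reliable candidate with the narrowest mean width.

    Assumption: a candidate is "reliable" when its coverage on the
    validation data is at least the nominal confidence level.
    Falls back to the default configuration if none is reliable.
    """
    reliable = {
        name: mean_width(ivs)
        for name, ivs in candidates.items()
        if coverage(ivs, actuals) >= nominal
    }
    if not reliable:
        return default
    return min(reliable, key=reliable.get)


# Hypothetical validation data: actual values and the intervals that
# three configurations of the tuned parameter produced for them.
actuals = [10, 12, 15, 20]
candidates = {
    "default": [(5, 15), (8, 16), (10, 20), (12, 28)],    # wide, covers all
    "config_a": [(8, 12), (10, 14), (13, 17), (21, 25)],  # narrow, misses one
    "config_b": [(9, 11), (13, 14), (16, 17), (19, 21)],  # narrow, misses two
}

best = select_configuration(candidates, actuals, nominal=0.75)
print(best)
```

In this sketch a validation technique only determines which data the intervals and actual values come from (for example, the held-out part of a 75/25 split); the selection rule itself stays the same.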

research
06/27/2023

A Meta-analytical Comparison of Naive Bayes and Random Forest for Software Defect Prediction

Is there a statistical difference between Naive Bayes and Random Forest ...
research
01/31/2018

The Impact of Automated Parameter Optimization on Defect Prediction Models

Defect prediction models---classifiers that identify defect-prone softwa...
research
12/16/2019

A Unified Framework for Random Forest Prediction Error Estimation

We introduce a unified framework for random forest prediction error esti...
research
03/09/2021

Interpretable Machines: Constructing Valid Prediction Intervals with Random Forests

An important issue when using Machine Learning algorithms in recent rese...
research
04/10/2018

Hyperparameters and Tuning Strategies for Random Forest

The random forest algorithm (RF) has several hyperparameters that have t...
research
05/24/2018

Prediction of Autism Treatment Response from Baseline fMRI using Random Forests and Tree Bagging

Treating children with autism spectrum disorders (ASD) with behavioral i...
research
12/24/2019

ADD-Lib: Decision Diagrams in Practice

In the paper, we present the ADD-Lib, our efficient and easy to use fram...