ggRandomForests: Visually Exploring a Random Forest for Regression

01/28/2015
by John Ehrlinger, et al.

Random Forests (RF) [Breiman:2001] are a fully non-parametric statistical method requiring no distributional assumptions on how the covariates relate to the response. RF are a robust, nonlinear technique that optimizes predictive accuracy by fitting an ensemble of trees to stabilize model estimates. The randomForestSRC package (http://cran.r-project.org/package=randomForestSRC) is a unified treatment of Breiman's random forests for survival, regression and classification problems. Predictive accuracy makes RF an attractive alternative to parametric models, though the complexity and limited interpretability of the forest hinder wider application of the method. We introduce the ggRandomForests package (http://cran.r-project.org/package=ggRandomForests) for visually understanding random forest models grown in R with the randomForestSRC package. This vignette is a tutorial for using the ggRandomForests package with the randomForestSRC package to build and post-process a regression random forest. In this tutorial, we explore a random forest model for the Boston Housing Data, available in the MASS package. We grow a random forest for regression and demonstrate how ggRandomForests can be used to determine variable associations, interactions, and how the response depends on predictive variables within the model. The tutorial demonstrates the design and usage of many ggRandomForests functions and features, and shows how to modify and customize the resulting ggplot2 graphic objects along the way. A development version of the ggRandomForests package is available on GitHub. We invite comments, feature requests and bug reports for this package at https://github.com/ehrlinger/ggRandomForests.


