Dimension Reduction Forests: Local Variable Importance using Structured Random Forests

03/24/2021
by Joshua Daniel Loyal, et al.

Random forests are one of the most popular machine learning methods due to their accuracy and variable importance assessment. However, random forests only provide variable importance in a global sense. There is an increasing need for such assessments at a local level, motivated by applications in personalized medicine, policy-making, and bioinformatics. We propose a new nonparametric estimator that pairs the flexible random forest kernel with local sufficient dimension reduction to adapt to a regression function's local structure. This allows us to estimate a meaningful directional local variable importance measure at each prediction point. We develop a computationally efficient fitting procedure and provide sufficient conditions for the recovery of the splitting directions. We demonstrate significant accuracy gains of our proposed estimator over competing methods on simulated and real regression problems. Finally, we apply the proposed method to seasonal particulate matter concentration data collected in Beijing, China, which yields meaningful local importance measures. The methods presented here are available in the drforest Python package.
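
To make the idea of pairing a random forest kernel with a local direction estimate concrete, here is a minimal sketch. It is not the drforest API and not the authors' estimator: it assumes scikit-learn's RandomForestRegressor as the forest, defines the kernel weight of a training point as the fraction of trees in which it shares a leaf with the query point, and substitutes a kernel-weighted least-squares slope for local sufficient dimension reduction. The helper names `forest_kernel_weights` and `local_direction` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def forest_kernel_weights(forest, X_train, x0):
    """Fraction of trees in which each training point shares a leaf with x0."""
    train_leaves = forest.apply(X_train)             # shape (n_samples, n_trees)
    query_leaves = forest.apply(x0.reshape(1, -1))   # shape (1, n_trees)
    return (train_leaves == query_leaves).mean(axis=1)


def local_direction(forest, X_train, y_train, x0):
    """Unit-norm slope of a kernel-weighted linear fit centred at x0.

    Assumption: a weighted least-squares slope stands in for the local
    sufficient dimension reduction step described in the abstract.
    """
    w = forest_kernel_weights(forest, X_train, x0)
    # Design matrix: intercept plus predictors centred at the query point.
    Z = np.hstack([np.ones((len(X_train), 1)), X_train - x0])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(sw * Z, sw[:, 0] * y_train, rcond=None)
    slope = coef[1:]
    norm = np.linalg.norm(slope)
    return slope / norm if norm > 0 else slope


# Toy data (hypothetical): the response depends only on the first predictor.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 3))
y = X[:, 0] ** 2 + 0.05 * rng.normal(size=500)

forest = RandomForestRegressor(
    n_estimators=100, min_samples_leaf=5, random_state=0
).fit(X, y)

x_query = np.array([0.7, 0.0, 0.0])
weights = forest_kernel_weights(forest, X, x_query)
direction = local_direction(forest, X, y, x_query)
print(direction)
```

The entries of `direction` play the role of a directional local importance at `x_query`: a large absolute loading on a predictor suggests the regression function changes mostly along that predictor near the query point. Repeating the call at other query points gives a different direction at each, which is the sense in which the measure is local rather than global.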

research
11/03/2021

From global to local MDI variable importances for random forests and when they are Shapley values

Random forests have been widely used for their ability to provide so-cal...
research
03/04/2020

Unbiased variable importance for random forests

The default variable-importance measure in random Forests, Gini importan...
research
06/15/2021

RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests

Like many predictive models, random forests provide a point prediction f...
research
01/25/2015

Prediction Error Reduction Function as a Variable Importance Score

This paper introduces and develops a novel variable importance score fun...
research
11/21/2021

Decorrelated Variable Importance

Because of the widespread use of black box prediction methods such as ra...
research
07/04/2023

MDI+: A Flexible Random Forest-Based Feature Importance Framework

Mean decrease in impurity (MDI) is a popular feature importance measure ...
research
02/06/2023

Random Forests for time-fixed and time-dependent predictors: The DynForest R package

The R package DynForest implements random forests for predicting a categ...
