Asymptotic Unbiasedness of the Permutation Importance Measure in Random Forest Models

12/05/2019
by Burim Ramosaj, et al.

Variable selection in sparse regression models is an important task, as applications ranging from biomedical research to econometrics have shown. Selecting informative variables is especially challenging in higher-dimensional regression problems, where the link function between response and covariates cannot be directly detected. Under these circumstances, the Random Forest method is a helpful tool for predicting new outcomes while also delivering measures for variable selection. One common approach is the permutation importance. Because this measure is intuitive and flexible to use, it is important to explore the circumstances under which the Random Forest permutation importance correctly indicates informative covariates. To this end, we deliver theoretical guarantees for the validity of the permutation importance measure under specific assumptions and prove its (asymptotic) unbiasedness. An extensive simulation study supports our findings.
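
The idea behind permutation importance can be illustrated with a minimal sketch: permute one covariate at a time and record how much the prediction error grows. The code below is not the paper's estimator. The paper studies the out-of-bag version computed from a fitted Random Forest, whereas this sketch assumes a hypothetical stand-in predictor (`stand_in_model`), toy data, and mean squared error as the loss. All names in the block are illustrative.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` on rows X with targets y."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling each column of X in turn."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only; all other columns and y stay fixed.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increase += mse(predict, X_perm, y) - baseline
        importances.append(increase / n_repeats)
    return importances


# Toy data (assumption): only the first covariate is informative.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [2.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]


def stand_in_model(row):
    """Hypothetical stand-in for a fitted forest's prediction function."""
    return 2.0 * row[0]


importances = permutation_importance(stand_in_model, X, y)
print(importances)
```

In this toy setting the informative covariate receives a clearly positive score, while the uninformative one scores zero because the stand-in predictor never reads it. With a real forest, the score of an uninformative covariate would typically fluctuate around zero rather than equal it exactly.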


research
07/06/2019

The revisited knockoffs method for variable selection in L1-penalised regressions

We consider the problem of variable selection in regression models. In p...
research
05/11/2020

Interpretable random forest models through forward variable selection

Random forest is a popular prediction approach for handling high dimensi...
research
06/18/2018

Variable Importance Assessments and Backward Variable Selection for High-Dimensional Data

Variable selection in high-dimensional scenarios is of great interest ...
research
09/14/2023

Statistically Valid Variable Importance Assessment through Conditional Permutations

Variable importance assessment has become a crucial step in machine-lear...
research
09/25/2020

A Feature Importance Analysis for Soft-Sensing-Based Predictions in a Chemical Sulphonation Process

In this paper we present the results of a feature importance analysis of...
research
04/05/2023

Opening the random forest black box by the analysis of the mutual impact of features

Random forest is a popular machine learning approach for the analysis of...
research
11/18/2015

A Random Forest Guided Tour

The random forest algorithm, proposed by L. Breiman in 2001, has been ex...
