Decorrelated Variable Importance

11/21/2021
by Isabella Verdinelli, et al.

Because of the widespread use of black box prediction methods such as random forests and neural nets, there is renewed interest in developing methods for quantifying variable importance as part of the broader goal of interpretable prediction. A popular approach is to define a variable importance parameter - known as LOCO (Leave Out COvariates) - based on dropping covariates from a regression model. This is essentially a nonparametric version of R-squared. This parameter is very general and can be estimated nonparametrically, but it can be hard to interpret because it is affected by correlation between covariates. We propose a method for mitigating the effect of correlation by defining a modified version of LOCO. This new parameter is difficult to estimate nonparametrically, but we show how to estimate it using semiparametric models.
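To make the correlation issue concrete, here is a minimal sketch of a LOCO-style estimate, computed as the increase in held-out mean squared error when one covariate is dropped. It is an illustration under stated assumptions, not the authors' estimator: ordinary least squares stands in for the regression method, the data are simulated, and the helper names (`fit_ols`, `predict_ols`, `loco`, `simulate`) are hypothetical. The decorrelated parameter proposed in the paper is not implemented here.

```python
import numpy as np

def fit_ols(X, y):
    # Least-squares fit with an intercept; returns the coefficient vector.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    # Predictions from the coefficients returned by fit_ols.
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def loco(X_train, y_train, X_test, y_test, j):
    # LOCO-style importance of covariate j: held-out MSE of the model
    # without column j minus held-out MSE of the full model.
    keep = [k for k in range(X_train.shape[1]) if k != j]
    full = fit_ols(X_train, y_train)
    reduced = fit_ols(X_train[:, keep], y_train)
    mse_full = np.mean((y_test - predict_ols(full, X_test)) ** 2)
    mse_reduced = np.mean((y_test - predict_ols(reduced, X_test[:, keep])) ** 2)
    return mse_reduced - mse_full

def simulate(rho, n=20000, seed=0):
    # Assumed toy model: two standard normal covariates with correlation rho,
    # and Y = 2 * X1 + noise, so only X1 enters the regression function.
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = 2.0 * X[:, 0] + rng.normal(size=n)
    half = n // 2  # first half for fitting, second half for evaluation
    return X[:half], y[:half], X[half:], y[half:]

loco_independent = loco(*simulate(rho=0.0), j=0)
loco_correlated = loco(*simulate(rho=0.95), j=0)
print(loco_independent, loco_correlated)
```

In this linear Gaussian toy model the population value is β²(1 − ρ²) with β = 2, so the importance assigned to X1 shrinks from about 4 to about 0.39 as the correlation rises from 0 to 0.95, even though the role of X1 in the regression function is unchanged. This is the kind of sensitivity to correlation that the abstract describes.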


