From global to local MDI variable importances for random forests and when they are Shapley values

11/03/2021
by Antonio Sutera, et al.

Random forests have been widely used for their ability to provide so-called importance measures, which give insight at a global (per dataset) level on the relevance of input variables to predict a certain output. On the other hand, methods based on Shapley values have been introduced to refine the analysis of feature relevance in tree-based models to a local (per instance) level. In this context, we first show that the global Mean Decrease of Impurity (MDI) variable importance scores correspond to Shapley values under some conditions. Then, we derive a local MDI importance measure of variable relevance, which has a very natural connection with the global MDI measure and can be related to a new notion of local feature relevance. We further link local MDI importances with Shapley values and discuss them in the light of related measures from the literature. The measures are illustrated through experiments on several classification and regression problems.
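
For readers who want the two quantities compared above in symbols, the block below restates the usual textbook definitions of the global MDI importance and of the Shapley value. The notation (N_T, p(t), \Delta i, v(s_t), \nu, V, p) is an assumption made here for illustration and is not taken from the abstract; the paper's own notation and the exact conditions under which the two coincide may differ.

```latex
% Assumed notation, for illustration only.
% Global MDI importance of variable X_j in a forest of N_T trees:
%   p(t)              fraction of training samples reaching node t
%   \Delta i(s_t, t)  impurity decrease produced by split s_t at node t
%   v(s_t)            variable used by split s_t
\mathrm{Imp}(X_j) = \frac{1}{N_T} \sum_{T} \sum_{t \in T \,:\, v(s_t) = X_j}
    p(t)\, \Delta i(s_t, t)

% Shapley value of player j in a cooperative game with characteristic
% function \nu over a set V of p players:
\phi_j(\nu) = \sum_{S \subseteq V \setminus \{j\}}
    \frac{|S|!\,(p - |S| - 1)!}{p!}
    \left( \nu(S \cup \{j\}) - \nu(S) \right)
```

Read this way, the first claim of the abstract is that, under some conditions, the MDI score of a variable equals its Shapley value for a suitably chosen characteristic function over the input variables.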
