SHAFF: Fast and consistent SHApley eFfect estimates via random Forests

05/25/2021
by Clément Bénard, et al.

Interpretability of learning algorithms is crucial for applications involving critical decisions, and variable importance is one of the main interpretation tools. Shapley effects are now widely used to interpret both tree ensembles and neural networks because, unlike most other variable importance measures, they can efficiently handle dependence and interactions in the data. However, estimating Shapley effects is a challenging task, because of the computational complexity and the conditional expectation estimates involved. Accordingly, existing Shapley algorithms have flaws: a costly running time, or a bias when input variables are dependent. Therefore, we introduce SHAFF, SHApley eFfects via random Forests, a fast and accurate Shapley effect estimate, even when input variables are dependent. We demonstrate the efficiency of SHAFF through both a theoretical analysis of its consistency and extensive experiments showing practical performance improvements over competitors. An implementation of SHAFF in C++ and R is available online.
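
To make the computational difficulty concrete, the sketch below evaluates the standard definition of Shapley effects by brute force. For p inputs, the effect of variable j is the sum, over all subsets U of the other variables, of (1/p) · 1/C(p-1, |U|) · (v(U ∪ {j}) - v(U)), where v(U) = V[E[Y | X_U]] / V[Y]. This is not the SHAFF algorithm: the function names and the toy additive model are assumptions made for illustration, and the value function is known in closed form here, whereas in practice it has to be estimated from data.

```python
from itertools import combinations
from math import comb


def shapley_effects(value, p):
    """Exact Shapley effects for p inputs, given a value function on index subsets.

    `value(u)` is assumed to return V[E[Y | X_u]] / V[Y] for a frozenset `u`.
    """
    effects = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Shapley weight shared by all subsets of this size
            weight = 1.0 / (p * comb(p - 1, size))
            for subset in combinations(others, size):
                u = frozenset(subset)
                total += weight * (value(u | {j}) - value(u))
        effects.append(total)
    return effects


# Toy model (an assumption for illustration): Y = 1*X0 + 2*X1 + 3*X2 with
# independent, unit-variance inputs, so V[E[Y | X_u]] is the sum of beta_k^2 over k in u.
BETA = [1.0, 2.0, 3.0]
TOTAL_VARIANCE = sum(b * b for b in BETA)


def toy_value(u):
    return sum(BETA[k] ** 2 for k in u) / TOTAL_VARIANCE


effects = shapley_effects(toy_value, len(BETA))
```

With independent inputs and an additive model, each effect reduces to beta_j^2 / V[Y], and the effects sum to 1. The enumeration visits every subset of the inputs, so its cost grows exponentially with p, and each call to the value function hides a conditional expectation. These are the two difficulties the abstract refers to.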


research
02/26/2021

MDA for random forests: inconsistency, and a practical solution via the Sobol-MDA

Variable importance measures are the main tools to analyze the black-box...
research
08/07/2023

Variable importance for causal forests: breaking down the heterogeneity of treatment effects

Causal random forests provide efficient estimates of heterogeneous treat...
research
07/28/2014

Understanding Random Forests: From Theory to Practice

Data analysis and machine learning have become an integrative part of th...
research
12/06/2022

The Importance of Variable Importance

Variable importance is defined as a measure of each regressor's contribu...
research
01/18/2022

Nonparametric Feature Selection by Random Forests and Deep Neural Networks

Random forests are a widely used machine learning algorithm, but their c...
research
10/27/2015

A Framework to Adjust Dependency Measure Estimates for Chance

Estimating the strength of dependency between two variables is fundament...
research
08/31/2022

The Infinitesimal Jackknife and Combinations of Models

The Infinitesimal Jackknife is a general method for estimating variances...
