Fast TreeSHAP: Accelerating SHAP Value Computation for Trees

09/20/2021
by Jilei Yang, et al.

SHAP (SHapley Additive exPlanation) values are one of the leading tools for interpreting machine learning models, with strong theoretical guarantees (consistency, local accuracy) and widely available implementations and use cases. Computing SHAP values takes exponential time in general, but TreeSHAP takes polynomial time on tree-based models. Although this speedup is significant, TreeSHAP can still dominate the computation time of industry-level machine learning solutions on datasets with millions of entries or more, delaying post-hoc model diagnosis and interpretation services. In this paper we present two new algorithms, Fast TreeSHAP v1 and v2, designed to improve the computational efficiency of TreeSHAP on large datasets. Empirically, Fast TreeSHAP v1 is 1.5x faster than TreeSHAP with memory cost unchanged. Fast TreeSHAP v2 is 2.5x faster than TreeSHAP thanks to the pre-computation of expensive TreeSHAP steps, at the cost of slightly higher memory usage. We also show that Fast TreeSHAP v2 is well suited to multi-time model interpretation, explaining newly incoming samples up to 3x faster.
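
To make the "exponential time in general" remark concrete, here is a minimal sketch of exact Shapley value computation by brute-force subset enumeration. It is not TreeSHAP or Fast TreeSHAP. The names `toy_model`, `x`, `baseline`, and `value_fn` are hypothetical, and replacing missing features with a single baseline value is a simplifying assumption made only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every feature subset.

    Each feature requires 2^(n_features - 1) coalition evaluations,
    which is the exponential cost that tree-specific methods avoid.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model with one additive term and one interaction term
def toy_model(features):
    return 2.0 * features[0] + features[1] * features[2]


x = [1.0, 2.0, 3.0]          # sample to explain (assumed)
baseline = [0.0, 0.0, 0.0]   # reference values for "missing" features (assumed)


def value_fn(subset):
    # Features in the coalition take their value from x, the rest from baseline
    mixed = [x[j] if j in subset else baseline[j] for j in range(len(x))]
    return toy_model(mixed)


phi = shapley_values(value_fn, len(x))
print(phi)
```

In this toy setup the attributions sum to `toy_model(x) - toy_model(baseline)`, which is the local accuracy property mentioned above, and the interaction term is split equally between the second and third features.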
