Shapley Explanation Networks

04/06/2021
by Rui Wang, et al.

Shapley values have become one of the most popular feature attribution explanation methods. However, most prior work has focused on post-hoc Shapley explanations, which can be computationally demanding because exact computation has exponential time complexity, and which preclude model regularization based on Shapley explanations during training. We therefore propose to incorporate Shapley values themselves as latent representations in deep models, thereby making Shapley explanations first-class citizens in the modeling paradigm. This intrinsic explanation approach enables layer-wise explanations, explanation regularization of the model during training, and fast explanation computation at test time. We define the Shapley transform, which maps the input to a Shapley representation given a specific function. We operationalize the Shapley transform as a neural network module and construct both shallow and deep networks, called ShapNets, by composing Shapley modules. We prove that our Shallow ShapNets compute the exact Shapley values and that our Deep ShapNets maintain the missingness and accuracy properties of Shapley values. We demonstrate on synthetic and real-world datasets that our ShapNets enable layer-wise Shapley explanations, novel Shapley regularizations during training, and fast computation while maintaining reasonable performance. Code is available at https://github.com/inouye-lab/ShapleyExplanationNetworks.
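To make the exponential cost of post-hoc Shapley explanations concrete, here is a minimal sketch of exact Shapley value computation by subset enumeration. It is an illustration only, not the ShapNet implementation from the linked repository. The names `exact_shapley` and `toy_model` are hypothetical, and replacing absent features with a fixed `baseline` vector is an assumption made for this example.

```python
from itertools import combinations
from math import factorial


def exact_shapley(f, x, baseline):
    """Exact Shapley values of f at x by enumerating all feature subsets.

    Assumption for this sketch: a feature that is "absent" from a subset
    is replaced by its value in `baseline`.
    """
    n = len(x)

    def value(subset):
        # Keep features in `subset`; replace the rest with the baseline.
        masked = [x[j] if j in subset else baseline[j] for j in range(n)]
        return f(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # 2^(n-1) subsets per feature: this is the exponential cost.
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one pairwise interaction.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)

# A feature that already equals its baseline value receives zero attribution.
phi_missing = exact_shapley(toy_model, [1.0, 0.0, 3.0], baseline)
```

In this toy case the attributions sum to `toy_model(x) - toy_model(baseline)`, which is the accuracy property the abstract refers to. The second call shows the missingness property under the baseline assumption. Each additional feature doubles the number of subsets to evaluate, which is why enumeration of this kind does not scale to high-dimensional inputs.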


