Understanding and Unifying Fourteen Attribution Methods with Taylor Interactions

03/02/2023
by Huiqi Deng, et al.

Various attribution methods have been developed to explain deep neural networks (DNNs) by inferring the attribution (importance, or contribution) score of each input variable to the final output. However, existing attribution methods are often built upon different heuristics, and there is still no unified theoretical understanding of why these methods are effective and how they are related. To this end, for the first time, we formulate the core mechanisms of fourteen attribution methods, which were designed on different heuristics, in the same mathematical system, i.e., the system of Taylor interactions. Specifically, we prove that the attribution scores estimated by the fourteen attribution methods can all be reformulated as a weighted sum of two types of effects, i.e., the independent effect of each individual input variable and the interaction effects between input variables. The essential difference among the fourteen attribution methods lies mainly in the weights they use to allocate these effects. Based on these findings, we propose three principles for a fair allocation of effects and use them to evaluate the faithfulness of the fourteen attribution methods.
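To make the idea of "allocating effects with different weights" concrete, the following is a minimal sketch on a toy function, not the paper's Taylor-interaction formulation. The function `f`, the zero baseline, and the helper names `occlusion` and `shapley` are all assumptions introduced for illustration. With a zero baseline, `f` has an independent effect of `x1` for the first variable, `2 * x2` for the second, and a single interaction effect `3 * x1 * x2`.

```python
from itertools import permutations

# Hypothetical toy model (an assumption, not from the paper):
# independent effects x1 and 2*x2, plus one interaction effect 3*x1*x2.
def f(x1, x2):
    return x1 + 2.0 * x2 + 3.0 * x1 * x2

def occlusion(x, baseline=(0.0, 0.0)):
    """Score of variable i = output drop when only variable i is set to the baseline."""
    full = f(*x)
    scores = []
    for i in range(len(x)):
        masked = list(x)
        masked[i] = baseline[i]
        scores.append(full - f(*masked))
    return scores

def shapley(x, baseline=(0.0, 0.0)):
    """Exact Shapley values: average marginal contribution over all variable orderings."""
    n = len(x)
    scores = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = f(*current)
        for i in order:
            current[i] = x[i]          # switch variable i from baseline to its input value
            new = f(*current)
            scores[i] += new - prev    # marginal contribution of i in this ordering
            prev = new
    return [s / len(orders) for s in scores]

x = (1.0, 1.0)
print(occlusion(x))  # [4.0, 5.0]
print(shapley(x))    # [2.5, 3.5]
```

On this toy function at `x = (1, 1)`, the interaction effect equals 3. The occlusion scores give the whole interaction to each variable (1 + 3 and 2 + 3), so they sum to 9 rather than to the output change of 6. The Shapley values split the interaction equally (1 + 1.5 and 2 + 1.5) and sum to 6. Both methods assign each variable its own independent effect in full and differ only in the weight placed on the shared interaction effect.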


research
05/28/2021

A General Taylor Framework for Unifying and Revisiting Attribution Methods

Attribution methods provide an insight into the decision-making process ...
research
11/16/2017

A unified view of gradient-based attribution methods for Deep Neural Networks

Understanding the flow of information in Deep Neural Networks is a chall...
research
12/12/2022

Utilizing Mutations to Evaluate Interpretability of Neural Networks on Genomic Data

Even though deep neural networks (DNNs) achieve state-of-the-art results...
research
05/16/2023

The Weighted Möbius Score: A Unified Framework for Feature Attribution

Feature attribution aims to explain the reasoning behind a black-box mod...
research
04/04/2023

HarsanyiNet: Computing Accurate Shapley Values in a Single Forward Propagation

The Shapley value is widely regarded as a trustworthy attribution metric...
research
12/14/2018

Inferring the size of the causal universe: features and fusion of causal attribution networks

Cause-and-effect reasoning, the attribution of effects to causes, is one...
research
01/17/2023

Negative Flux Aggregation to Estimate Feature Attributions

There are increasing demands for understanding deep neural networks' (DN...
