Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision

05/31/2023
by Chase Walker, et al.

Attribution algorithms are frequently employed to explain the decisions of neural network models. Integrated Gradients (IG) is an influential attribution method due to its strong axiomatic foundation. The algorithm integrates the gradients along a path from a reference image to the input image. Unfortunately, gradients computed from regions of the path where the output logit changes minimally provide poor explanations for the model decision, a problem known as the saturation effect. In this paper, we propose an attribution algorithm called integrated decision gradients (IDG). The algorithm focuses on integrating gradients from the region of the path where the model makes its decision, i.e., the portion of the path where the output logit rapidly transitions from zero to its final value. In practice, this is realized by scaling each gradient by the derivative of the output logit with respect to the path. The algorithm thereby provides a principled solution to the saturation problem. Additionally, we minimize the errors within the Riemann sum approximation of the path integral by utilizing non-uniform subdivisions determined by adaptive sampling. In an evaluation on ImageNet, IDG outperforms IG, left-IG, guided IG, and adversarial gradient integration both qualitatively and quantitatively using standard insertion and deletion metrics across three common models.
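
The scaling idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy model (a tanh of a linear score), the names `logit`, `logit_grad`, and `path_attributions`, the zero baseline, and the uniform midpoint Riemann sum are all assumptions made for illustration, and the adaptive (non-uniform) sampling step is omitted. The final multiplication of IDG by the input difference follows the usual IG convention and is likewise an assumption here.

```python
import numpy as np

def logit(x, w):
    # Hypothetical toy "model": a saturating output logit.
    return np.tanh(w @ x)

def logit_grad(x, w):
    # Analytic gradient of tanh(w . x) with respect to x.
    return (1.0 - np.tanh(w @ x) ** 2) * w

def path_attributions(x, baseline, w, steps=200):
    """Return (IG, IDG-style) attributions on a straight-line path.

    IG sums the gradients along the path. The IDG-style variant
    scales each gradient by d(logit)/d(alpha), so steps where the
    logit barely changes (saturation) contribute almost nothing.
    """
    delta = x - baseline
    alphas = (np.arange(steps) + 0.5) / steps  # uniform midpoints in (0, 1)
    d_alpha = 1.0 / steps
    ig_sum = np.zeros_like(x)
    idg_sum = np.zeros_like(x)
    for a in alphas:
        g = logit_grad(baseline + a * delta, w)
        dlogit_dalpha = g @ delta  # chain rule along the path
        ig_sum += g * d_alpha
        idg_sum += g * dlogit_dalpha * d_alpha
    return delta * ig_sum, delta * idg_sum

x = np.array([2.0, 1.0, -0.5])
w = np.array([1.5, 1.0, 0.5])
baseline = np.zeros_like(x)
ig, idg = path_attributions(x, baseline, w)
# IG satisfies completeness: its attributions sum to logit(x) - logit(baseline),
# up to the Riemann-sum approximation error.
print(ig.sum(), logit(x, w) - logit(baseline, w))
print(idg)
```

In this one-direction toy model both methods rank the features identically; the sketch only shows how the per-step weighting is applied, not the qualitative differences reported in the paper.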

research
10/23/2020

Investigating Saturation Effects in Integrated Gradients

Integrated Gradients has become a popular method for post-hoc model inte...
research
06/17/2021

Guided Integrated Gradients: An Adaptive Path Method for Removing Noise

Integrated Gradients (IG) is a commonly used feature attribution method ...
research
07/08/2020

An exploration of the influence of path choice in game-theoretic attribution algorithms

We compare machine learning explainability methods based on the theory o...
research
08/31/2021

Discretized Integrated Gradients for Explaining Language Models

As a prominent attribution-based explanation algorithm, Integrated Gradi...
research
02/22/2023

Non-Uniform Interpolation in Integrated Gradients for Low-Latency Explainable-AI

There has been a surge in Explainable-AI (XAI) methods that provide insi...
research
06/13/2022

Geometrically Guided Integrated Gradients

Interpretability methods for deep neural networks mainly focus on the se...
research
10/14/2020

Learning Propagation Rules for Attribution Map Generation

Prior gradient-based attribution-map methods rely on handcrafted propaga...
