Discretized Integrated Gradients for Explaining Language Models

08/31/2021
by Soumya Sanyal, et al.
As a prominent attribution-based explanation algorithm, Integrated Gradients (IG) is widely adopted due to its desirable explanation axioms and the ease of gradient computation. It measures feature importance by averaging the gradients of the model's output at points interpolated along a straight-line path in the input data space. However, such straight-line interpolated points are not representative of text data due to the inherent discreteness of the word embedding space. This calls into question the faithfulness of the gradients computed at the interpolated points and, consequently, the quality of the generated explanations. Here we propose Discretized Integrated Gradients (DIG), which allows effective attribution along non-linear interpolation paths. We develop two interpolation strategies for the discrete word embedding space that generate interpolation points lying close to actual words in the embedding space, yielding more faithful gradient computation. We demonstrate the effectiveness of DIG over IG through experimental and human evaluations on multiple sentiment classification datasets. We provide the source code of DIG to encourage reproducible research.
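
To make the path-based idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a toy differentiable function `f` with a known gradient in place of a language model, approximates the path integral with a left Riemann sum, and uses a hand-picked `waypoint` as a stand-in for an interpolation point near a real word. All names (`f`, `grad_f`, `straight_path`, `path_attributions`, `waypoint`) are hypothetical. With a straight-line path the routine gives an IG-style attribution; with any other sequence of points it attributes along a non-linear path.

```python
import numpy as np


def f(x):
    # Toy scalar "model output" (assumption: stands in for a real model).
    return float(np.sum(x ** 2))


def grad_f(x):
    # Analytic gradient of the toy function above.
    return 2.0 * x


def straight_path(baseline, x, steps):
    # Points on the straight line from baseline to x, endpoints included.
    alphas = np.linspace(0.0, 1.0, steps + 1)[:, None]
    return baseline + alphas * (x - baseline)


def path_attributions(grad_fn, path):
    # Left Riemann sum of gradient * step along an arbitrary sequence of points.
    grads = np.stack([grad_fn(p) for p in path[:-1]])
    deltas = np.diff(path, axis=0)
    return np.sum(grads * deltas, axis=0)


baseline = np.zeros(3)
x = np.array([1.0, -2.0, 0.5])

# Straight-line path: an IG-style attribution.
ig_attr = path_attributions(grad_f, straight_path(baseline, x, steps=1000))

# Non-linear path through a hypothetical waypoint (a stand-in for a point
# that lies near an actual word embedding).
waypoint = np.array([1.0, 0.0, 0.0])
bent_path = np.concatenate([
    straight_path(baseline, waypoint, 500),
    straight_path(waypoint, x, 500)[1:],
])
bent_attr = path_attributions(grad_f, bent_path)

print("straight-line attributions:", ig_attr)
print("bent-path attributions:   ", bent_attr)
print("f(x) - f(baseline):       ", f(x) - f(baseline))
```

In both cases the attributions should sum approximately to `f(x) - f(baseline)`, up to the discretization error of the Riemann sum. The per-feature values can differ between paths, which is why the choice of interpolation points matters.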

research
05/25/2023

Sequential Integrated Gradients: a simple but effective method for explaining language models

Several explanation methods such as Integrated Gradients (IG) can be cha...
research
09/27/2017

Case Study: Explaining Diabetic Retinopathy Detection Deep CNNs via Integrated Gradients

In this report, we applied integrated gradients to explaining a neural n...
research
05/31/2023

Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision

Attribution algorithms are frequently employed to explain the decisions ...
research
06/15/2022

The Manifold Hypothesis for Gradient-Based Explanations

When do gradient-based explanation algorithms provide meaningful explana...
research
05/11/2018

State Gradients for RNN Memory Analysis

We present a framework for analyzing what the state in RNNs remembers fr...
research
06/29/2022

Teach me how to Interpolate a Myriad of Embeddings

Mixup refers to interpolation-based data augmentation, originally motiva...
research
09/04/2019

Generalized Integrated Gradients: A practical method for explaining diverse ensembles

We introduce Generalized Integrated Gradients (GIG), a formal extension ...
