Attention is not not Explanation

08/13/2019
by Sarah Wiegreffe, et al.

Attention mechanisms play a central role in NLP systems, especially within recurrent neural network (RNN) models. Recently, there has been increasing interest in whether the intermediate representations offered by these modules may be used to explain the reasoning behind a model's prediction, and consequently to gain insight into the model's decision-making process. A recent paper claims that "Attention is not Explanation" (Jain and Wallace, 2019). We challenge many of the assumptions underlying this work, arguing that such a claim depends on one's definition of explanation, and that testing it requires taking into account all elements of the model, using a rigorous experimental design. We propose four alternative tests to determine when and whether attention can be used as explanation: a simple uniform-weights baseline; a variance calibration based on multiple random-seed runs; a diagnostic framework using frozen weights from pretrained models; and an end-to-end adversarial attention training protocol. Each allows for meaningful interpretation of attention mechanisms in RNN models. We show that even when reliable adversarial distributions can be found, they do not perform well on the simple diagnostic, indicating that prior work does not disprove the usefulness of attention mechanisms for explainability.
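The first of the four tests, the uniform-weights baseline, can be sketched in a few lines. The snippet below is an illustrative toy (not the authors' implementation): it replaces a "learned" attention distribution over RNN hidden states with uniform weights and compares the resulting context vectors. All names (`attend`, `H`, the random scores) are hypothetical stand-ins; the idea is that if a downstream classifier does no worse with the uniform context vector, the learned distribution carries little task-specific signal.

```python
# Hypothetical sketch of a uniform-weights attention baseline.
import numpy as np

def attend(hidden_states, weights):
    """Context vector: attention-weighted sum of hidden states."""
    return (weights[:, None] * hidden_states).sum(axis=0)

rng = np.random.default_rng(0)
T, d = 5, 4                        # sequence length, hidden size
H = rng.normal(size=(T, d))        # stand-in for RNN hidden states

# "Learned" attention: softmax over random scores (peaked distribution).
scores = rng.normal(size=T)
learned = np.exp(scores) / np.exp(scores).sum()

# Baseline: uniform attention over all tokens.
uniform = np.full(T, 1.0 / T)

ctx_learned = attend(H, learned)
ctx_uniform = attend(H, uniform)

# In the actual test, one would compare downstream task performance with
# each context vector; here we just report how far apart they are.
gap = np.linalg.norm(ctx_learned - ctx_uniform)
print(f"context-vector gap: {gap:.3f}")
```

Both weight vectors are valid distributions (they sum to one), so the comparison isolates the effect of *where* the attention mass is placed, which is exactly what the baseline is meant to probe.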


