Is Attention Interpretable?

06/09/2019
by Sofia Serrano et al.

Attention mechanisms have recently boosted performance on a range of NLP tasks. Because attention layers explicitly weight input components' representations, it is also often assumed that attention can be used to identify information that models found important (e.g., specific contextualized word tokens). We test whether that assumption holds by manipulating attention weights in already-trained text classification models and analyzing the resulting differences in their predictions. While we observe some ways in which higher attention weights correlate with greater impact on model predictions, we also find many ways in which this does not hold, i.e., where gradient-based rankings of attention weights better predict their effects than their magnitudes. We conclude that while attention noisily predicts input components' overall importance to a model, it is by no means a fail-safe indicator.
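The manipulation described in the abstract can be sketched in a few lines: zero out one attention weight at a time, renormalize the remaining weights, and measure how far the model's prediction moves, then compare that impact against both the weight's magnitude and a gradient-based ranking. The PyTorch snippet below is a minimal illustration on a toy attention-weighted classifier, not the authors' code; the model, its dimensions, and the gradient-times-weight ranking are assumptions made for the example.

```python
# Minimal sketch (assumed setup, not the paper's implementation): test whether
# an attention weight's magnitude predicts its impact on the prediction.
import torch

torch.manual_seed(0)
d, n_tokens, n_classes = 8, 5, 2
token_reprs = torch.randn(n_tokens, d)              # contextualized token vectors
attn = torch.softmax(torch.randn(n_tokens), dim=0)  # attention distribution
attn.requires_grad_(True)
classifier = torch.nn.Linear(d, n_classes)          # stand-in output layer

def predict(weights):
    """Attention-weighted sum of token representations -> class probabilities."""
    context = weights @ token_reprs
    return torch.softmax(classifier(context), dim=-1)

base = predict(attn)
pred_class = base.argmax()

# Gradient-based ranking: |d(predicted-class score)/d(attention weight) * weight|.
base[pred_class].backward()
grad_ranking = (attn.grad * attn.detach()).abs().argsort(descending=True)
print("magnitude ranking:", attn.detach().argsort(descending=True).tolist())
print("gradient ranking: ", grad_ranking.tolist())

# Impact of removal: zero one weight, renormalize, and measure the prediction shift.
with torch.no_grad():
    for i in attn.argsort(descending=True):
        masked = attn.clone()
        masked[i] = 0.0
        masked = masked / masked.sum()
        delta = (base - predict(masked)).abs().max()
        print(f"token {int(i)}: attention={attn[i].item():.3f}, "
              f"prediction shift={delta.item():.4f}")
```

If attention were a reliable importance indicator, the two rankings would agree and the prediction shift would fall off monotonically with attention magnitude; the paper's finding is that this often fails to hold.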

Related research

02/26/2019 · Attention is not Explanation
04/29/2020 · Towards Transparent and Explainable Attention Models
10/12/2020 · Gradient-based Analysis of NLP Models is Manipulable
09/17/2019 · Learning to Deceive with Attention-Based Explanations
08/12/2020 · Text Classification based on Multi-granularity Attention Hybrid Neural Network
05/31/2023 · Assessing Word Importance Using Models Trained for Semantic Tasks
10/15/2021 · Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
