Towards Transparent and Explainable Attention Models

04/29/2020
by Akash Kumar Mohankumar et al.

Recent studies on the interpretability of attention distributions have led to notions of faithful and plausible explanations for a model's predictions. Attention distributions can be considered a faithful explanation if a higher attention weight implies a greater impact on the model's prediction. They can be considered a plausible explanation if they provide a human-understandable justification for the model's predictions. In this work, we first explain why current attention mechanisms in LSTM-based encoders can provide neither a faithful nor a plausible explanation of the model's predictions. We observe that in LSTM-based encoders the hidden representations at different time steps are very similar to each other (high conicity), and in these situations the attention weights carry little meaning because even a random permutation of the attention weights does not affect the model's predictions. Based on experiments on a wide variety of tasks and datasets, we observe that attention distributions often attribute the model's predictions to unimportant words such as punctuation and fail to offer a plausible explanation for the predictions. To make attention mechanisms more faithful and plausible, we propose a modified LSTM cell with a diversity-driven training objective that ensures that the hidden representations learned at different time steps are diverse. We show that the resulting attention distributions offer more transparency as they (i) provide a more precise importance ranking of the hidden states, (ii) are better indicative of words important for the model's predictions, and (iii) correlate better with gradient-based attribution methods. Human evaluations indicate that the attention distributions learned by our model offer a plausible explanation of the model's predictions. Our code is publicly available at https://github.com/akashkm99/Interpretable-Attention
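As a rough illustration of the quantities discussed above, the sketch below computes conicity (the mean cosine similarity between each hidden state and the mean hidden state), runs the random-permutation check on the attention weights, and adds a diversity penalty to the training loss. This is a minimal sketch, not code from the released repository: the names (conicity, permuted_prediction, training_loss, lambda_div, model, batch, output_layer) are illustrative assumptions, and the penalty form loss = cross-entropy + lambda * conicity is only one plausible way to realize the diversity-driven objective described in the abstract.

```python
import torch
import torch.nn.functional as F

def conicity(hidden_states: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Mean cosine similarity between each hidden state and the mean hidden
    state. Values near 1 mean the states lie in a narrow cone (high conicity).

    hidden_states: (T, d) tensor of encoder hidden states for one sequence.
    """
    mean_vec = hidden_states.mean(dim=0, keepdim=True)                   # (1, d)
    cos = F.cosine_similarity(hidden_states, mean_vec, dim=-1, eps=eps)  # (T,)
    return cos.mean()

def permuted_prediction(attention, hidden_states, output_layer):
    """Randomly permute the attention weights and recompute the prediction.
    If the prediction barely changes, the attention weights cannot be a
    faithful explanation of the model's output."""
    perm = torch.randperm(attention.numel())
    context = (attention[perm].unsqueeze(-1) * hidden_states).sum(dim=0)  # (d,)
    return output_layer(context)

def training_loss(model, batch, lambda_div: float = 0.5):
    """Hypothetical diversity-driven objective: cross-entropy plus a penalty
    on the conicity of the encoder's hidden states (assumed form)."""
    hidden_states, logits = model(batch["tokens"])      # (T, d), (num_classes,)
    nll = F.cross_entropy(logits.unsqueeze(0), batch["label"].unsqueeze(0))
    return nll + lambda_div * conicity(hidden_states)
```

Here lambda_div weighs the prediction loss against the diversity penalty; the exact objective, model interface, and hyperparameters used in the paper can be found in the linked repository.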

Related research

Attention is not Explanation (02/26/2019)
Attention mechanisms have seen wide adoption in neural NLP models. In ad...

Is Sparse Attention more Interpretable? (06/02/2021)
Sparse attention has been claimed to increase model interpretability und...

Is Attention Interpretable? (06/09/2019)
Attention mechanisms have recently boosted performance on a range of NLP...

VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives (06/22/2022)
Many past works aim to improve visual reasoning in models by supervising...

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods? (10/12/2020)
There is a recent surge of interest in using attention as explanation of...

Attention is not not Explanation (08/13/2019)
Attention mechanisms play a central role in NLP systems, especially with...

SEAT: Stable and Explainable Attention (11/23/2022)
Currently, attention mechanism becomes a standard fixture in most state-...
