Why is Attention Not So Attentive?

06/10/2020 ∙ by Bing Bai, et al. ∙ Tencent, Cornell University

Attention-based methods have played an important role in model interpretation, where the calculated attention weights are expected to highlight the critical parts of the input (e.g., keywords in sentences). However, recent research has pointed out that attention-as-importance interpretations often do not work as well as we expect. For example, learned attention weights are frequently uncorrelated with other feature-importance indicators such as gradient-based measures, and a debate on the effectiveness of attention-based interpretations has arisen. In this paper, we reveal that one root cause of this phenomenon is what we call combinatorial shortcuts: models may obtain information not only from the parts highlighted by the attention mechanism but also from the attention weights themselves. We design an intuitive experiment to demonstrate the existence of combinatorial shortcuts and propose two methods to mitigate the issue. Empirical studies on attention-based instance-wise feature selection interpretation models show that the proposed methods can effectively improve the interpretability of attention mechanisms on a variety of datasets.




1 Introduction

Interpretation for machine learning models has gained increasing interest and has become a necessity as industry rapidly embraces machine learning technologies. Model interpretation explains how models make decisions, which is particularly essential in mission-critical domains where the accountability and transparency of the decision-making process are crucial, such as medicine Wang et al. (2019), security Chakraborti et al. (2019), and criminal justice Lipton (2018).

Attention mechanisms have played an important role in model interpretations and have been widely adopted for interpreting neural networks Vaswani et al. (2017) and other black-box models Bang et al. (2019); Chen et al. (2018). Similar to Vaswani et al. (2017), for the attention mechanisms, we assume that we have a query $Q$ and a set of key-value pairs $(K, V)$. In this paper, the attention weights, denoted as masks $M$, are calculated with $Q$ and $K$, and then filter the information of $V$ as follows (the scaled dot-product form is one common instantiation):

$$\mathrm{Attention}(Q, K, V) = M V, \quad \text{where } M = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right).$$

Intuitively, the masks $M$ are expected to represent the importance of different parts of $V$ (e.g., words of a sentence, pixels of an image) and highlight those the model should focus on to make decisions, and many researchers directly use the masks to provide interpretability of models Choi et al. (2016); Vaswani et al. (2017); Wang et al. (2016).
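As a concrete sketch of this setup (a generic NumPy stand-in assuming the scaled dot-product form of Vaswani et al. (2017), not any particular model in this paper), the mask is computed from the query and keys, then mixes the values:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: the mask M is computed from Q and K,
    and the output carries information from V weighted by M."""
    d_k = K.shape[-1]
    M = softmax(Q @ K.T / np.sqrt(d_k))  # attention weights ("masks"); rows sum to 1
    return M @ V, M

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 8))   # 2 queries
K = rng.normal(size=(5, 8))   # 5 keys
V = rng.normal(size=(5, 16))  # 5 values
out, M = attention(Q, K, V)
# M has shape (2, 5): one weight distribution over the 5 values per query
```

Note that the output is a product of both $M$ and $V$, which is precisely the opening the rest of this paper examines.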

However, recent research suggests that the parts highlighted by attention mechanisms do not necessarily correlate with greater impacts on models’ predictions Jain and Wallace (2019). Many researchers have provided evidence to support or refute the interpretability of attention mechanisms, and a debate on the effectiveness of attention-based interpretations has arisen Jain and Wallace (2019); Serrano and Smith (2019); Wiegreffe and Pinter (2019).

In this paper, we discover a root cause that hinders the interpretability of attention mechanisms, which we refer to as combinatorial shortcuts. As mentioned earlier, we expect the results of attention mechanisms to mainly contain information from the highlighted parts of $V$, which is the critical assumption underlying the effectiveness of attention-based interpretations. However, as the results are products of the masks $M$ and $V$, we find that the masks themselves can carry extra information beyond the highlighted parts of $V$, which can be utilized by the downstream parts of models. As a result, the calculated masks may work as another kind of “encoding layer” rather than providing pure importance weights. For example, in a (binary) text classification task, the attention mechanism could choose to highlight the first word for positive cases and the second word for negative cases, regardless of what the words are, and the downstream model could then predict the label by checking whether the first or the second word is highlighted. This may result in good accuracy scores while completely failing to provide interpretability.[2]

[2] One may argue that in the most ordinary practice, where sum pooling is applied, we lose the positional information, so the intuitive case described above may not hold. However, since (1) the distributions of different positions are not the same, and (2) positional encodings Vaswani et al. (2017) have been widely used, it is still possible for attention mechanisms to utilize positional information with sum pooling.
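The shortcut described above can be reproduced with a tiny synthetic sketch (hypothetical data, not the paper's experiment): a label-dependent mask that ignores token content, combined with positional encodings and sum pooling, lets a trivial downstream check recover the label from tokens that carry no information at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n, seq_len, d = 200, 6, 8
X = rng.normal(scale=0.1, size=(n, seq_len, d))  # token embeddings: pure noise, no label signal
y = rng.integers(0, 2, size=n)                    # labels, independent of X

# Positional encodings (one-hot here for clarity) are added to the embeddings.
pos = np.eye(seq_len, d)                          # pos[i] marks position i
Z = X + pos[None, :, :]

# A "shortcut" mask: highlight position 0 iff y == 1, position 1 iff y == 0,
# regardless of token content.
M = np.zeros((n, seq_len))
M[np.arange(n), 1 - y] = 1.0

pooled = (M[:, :, None] * Z).sum(axis=1)          # sum pooling of the attended representation

# The downstream model only needs to check which positional encoding survived pooling:
pred = (pooled[:, 0] > pooled[:, 1]).astype(int)  # dim 0 large -> position 0 was highlighted -> y = 1
accuracy = (pred == y).mean()                      # ~1.0 despite contentless tokens
```

The near-perfect accuracy comes entirely from the mask acting as an "encoding layer", which is exactly the failure mode combinatorial shortcuts describe.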

We further study the effectiveness of attention-based interpretations and dive into the combinatorial shortcut problem. We first analyze the gap between attention mechanisms and ideal interpretations theoretically and show the existence of combinatorial shortcuts through a representative experiment. We then propose two practical methods to mitigate this issue: random attention pretraining, and instance weighting for mask-neutral learning. Without loss of generality, we examine the effectiveness of the proposed methods based on an end-to-end attention-based model-interpretation method, L2X Chen et al. (2018), which can select a given number of input components to explain arbitrary black-box models. Experimental results on both text and image classification tasks show that the proposed methods can successfully mitigate the adverse impact of combinatorial shortcuts and improve explanation performance.

2 Related Work

Model interpretations

Existing methods for model interpretations can be categorized into model-specific methods and model-agnostic methods. Model-specific methods take advantage of specific types of models, such as gradient-based methods for neural networks. On the other hand, model-agnostic methods are capable of explaining any given model, as long as the input and output are accessible. Instance-wise feature selection (IFS), which produces importance scores for each feature's influence on the model's decision, is a well-known model-agnostic interpretation approach Du et al. (2019). Recent research on IFS model explanation can be divided into (local/global) feature attribution methods Ancona et al. (2017); Yeh et al. (2019) and direct model-interpretation methods. Local feature attribution methods provide sensitivity scores of the model output with respect to changes of the features in a neighborhood. In contrast, global feature attribution methods directly produce the amount of change of the model output given changes of the features.[3] Other than providing the change of the model output, direct model interpretation (DMI) is a more straightforward approach that selects features and uses a model to approximate the output of the original black-box model Chen et al. (2018); Sundararajan et al. (2017). In addition to the above research, there is also work on benchmarks for interpretability methods Hooker et al. (2019).

[3] The definitions of global and local explanations in this paper follow the descriptions of Ancona et al. (2017) and Yeh et al. (2019), and are distinct from those of Plumb et al. (2018).

Attention mechanisms for model interpretations

Attention mechanisms have been widely adopted in natural language processing Bahdanau et al. (2015); Vinyals et al. (2015), computer vision Fu et al. (2016); Li et al. (2019), recommendations Bai et al. (2020); Tay et al. (2018); Zhang et al. (2020b), and so on. Despite many variants, attention mechanisms usually calculate non-negative weights for each input component, multiply those weights by their corresponding representations, and then encode the resulting vectors into a single fixed-length representation Serrano and Smith (2019). Attention mechanisms are believed to explain how models make decisions by exhibiting the importance distribution over inputs Choi et al. (2016); Martins and Astudillo (2016); Wang et al. (2016), which we can also regard as a kind of model-specific interpretation. Besides, there are also attention-based methods for model-agnostic interpretations. For example, L2X Chen et al. (2018) is a hard attention model Xu et al. (2015) that employs Gumbel-softmax Jang et al. (2017) for instance-wise feature selection. VIBI Bang et al. (2019) improves L2X to encourage the briefness of the learned explanation by adding a constraint that draws the feature scores toward a global prior.

However, there has recently been a debate on the effectiveness of the interpretability of attention mechanisms. Jain and Wallace (2019) suggests that “attention is not explanation” by finding that the attention weights are frequently uncorrelated with gradient-based measures of feature importance, and that one can identify very different attention distributions that yield equivalent predictions. On the other hand, Wiegreffe and Pinter (2019) argues that “attention is not not-explanation” by challenging many assumptions underlying Jain and Wallace (2019) and suggesting that it does not disprove the usefulness of attention mechanisms for explainability. Serrano and Smith (2019) applies a different analysis based on intermediate representation erasure and finds that while attention noisily predicts input components’ overall importance to a model, it is by no means a fail-safe indicator. In this work, we take another perspective on this problem, combinatorial shortcuts, and show that it reveals one root cause of the phenomenon.

3 Combinatorial Shortcuts

In this section, we show the gap between attention mechanisms and ideal explanations from a perspective of causal effects estimation, and conduct an experiment to demonstrate the existence of combinatorial shortcuts.

3.1 The gap between attention mechanisms and ideal explanations

Assume that we have samples $(x, y)$ drawn independently and identically distributed (i.i.d.) from a distribution $\mathcal{D}$ with domain $\mathcal{X} \times \mathcal{Y}$, where $\mathcal{X}$ is the feature domain and $\mathcal{Y}$ is the label domain.[4] Additionally, we assume that the mask $m$ is drawn from a distribution with domain $\mathcal{M}$. For $m_1, m_2 \in \mathcal{M}$ and any given sample $(x, y)$, if $\mathbb{E}[\ell(f(m_1 \odot x), y)] < \mathbb{E}[\ell(f(m_2 \odot x), y)]$, where $\ell$ is the loss function, $f$ is the downstream model, and $\mathbb{E}$ calculates the expectation, we say that for sample $(x, y)$, $m_1$ is superior to $m_2$ in terms of interpretability. Usually, $m$ is under some constraints, for example, only being able to select a fixed number of features, or being non-negative and summing to 1. If an unbiased estimation of $\mathbb{E}[\ell(f(m \odot x), y)]$ is available, the best mask for sample $(x, y)$, which selects the most informative features, can be obtained by solving $m^* = \arg\min_{m \in \mathcal{M}} \mathbb{E}[\ell(f(m \odot x), y)]$. In practice, we often need to train models to estimate $\mathbb{E}[\ell(f(m \odot x), y)]$ since both $m$ and $x$ are vectors; this problem has been extensively investigated as causal effect estimation for observational studies Rubin (1974); Schneider et al. (2007).

[4] Note that here we use $x$ to represent the features by convention; $x$ is the same as the value $V$ introduced in the Introduction. Besides, the labels could be either from the real world (for explaining the real world) or from some specific model (for explaining a given black-box model).

Ideally, if the data (combinations of $m$ and $x$, as well as the label $y$) were exhaustive and the model consistent, we could train a model to obtain an unbiased estimation of $\mathbb{E}[\ell(f(m \odot x), y)]$ following the empirical risk minimization principle Fan et al. (2005); Vapnik (1992). In reality, it is not possible to exhaust all combinations of $m$ and $x$, but randomized combinations of $m$ and $x$ can still give unbiased estimations Rubin (1974). However, attention mechanisms do not work in this way. The combination of $m$ and $x$ is highly selective in the training procedure of attention mechanisms, since the used mask is a function of the query $Q$ and key $K$, which are highly related to the features of samples, if not directly extracted from them. In conclusion, from the perspective of causal effect estimation, the training procedure of attention mechanisms produces nonrandomized experiments Shadish et al. (2008). Thus the model cannot learn unbiased estimations of $\mathbb{E}[\ell(f(m \odot x), y)]$, fails to capture the real causal effects between highlighted features and the labels, and finally fails to find the best masks that provide the best interpretability. In this paper, we denote the effects of the nonrandomized combination of $m$ and $x$ as combinatorial shortcuts, as they provide lanes for models to predict the labels not by analyzing what information is highlighted by attention mechanisms, but rather by using the masks themselves to guess the labels.

3.2 Experimental design for the demonstration of combinatorial shortcuts

Figure 1: The structure of the models used in this demo experiment.

To intuitively demonstrate the existence of combinatorial shortcuts, we design an experiment on the real-world IMDB movie review dataset Maas et al. (2011), a text classification task. As Figure 1 shows, the experiment works as follows. First, we train baseline models using only the first five tokens of each sentence along with four default tokens (A B C D). Then we train attention models, where the query of attention is generated from the whole sentence; however, we only allow attention to highlight keywords among the first five tokens and the four default tokens. Because the default tokens carry no useful information at all, if the attention mechanism can indeed highlight the key parts of the inputs, little attention should be paid to them. Correspondingly, if the attention mechanism assigns much attention weight to the default tokens, we can say that the masks themselves are serving as features for the downstream models, i.e., combinatorial shortcuts are taking effect. In conclusion, we can check whether attention mechanisms highlight the right parts of the input by monitoring how much attention is paid to the default tokens.
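The diagnostic behind this design can be sketched as follows (a minimal NumPy stand-in; the function name and the example weights are illustrative, not the paper's code): given attention weights over the five real tokens and the four default tokens, we measure the fraction of total attention mass placed on the defaults.

```python
import numpy as np

def attention_to_default_tokens(weights, n_real=5, n_default=4):
    """Fraction of total attention mass assigned to the uninformative default tokens.

    `weights`: (batch, n_real + n_default) non-negative attention weights.
    If the mechanism truly highlights informative input, this should be near 0."""
    w = np.asarray(weights, dtype=float)
    assert w.shape[1] == n_real + n_default
    return w[:, n_real:].sum() / w.sum()

# Hypothetical batch: each row is a softmax output (sums to 1);
# columns 0-4 are real tokens, columns 5-8 are the defaults A B C D.
w = np.array([[0.05, 0.05, 0.10, 0.05, 0.05, 0.30, 0.20, 0.10, 0.10],
              [0.02, 0.08, 0.10, 0.05, 0.05, 0.25, 0.25, 0.10, 0.10]])
share = attention_to_default_tokens(w)  # 0.7 here: shortcut-like behaviour
```

A value like 0.7 would indicate that most of the attention budget is spent on tokens that cannot matter, mirroring the pattern reported in Table 1.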

3.3 Experimental results and discussions

We examine different settings for the encoder in Figure 1: whether the encoder is a simple pooling[5] or a trainable neural network model, namely the recurrent convolutional neural network (RCNN) proposed in Lai et al. (2015). When we apply pooling encoders, the experiment becomes the most ordinary setting for attention mechanisms. The results are reported in Table 1. Note that we report the results on the training set to demonstrate how the models fit the data. We use pretrained GloVe word embeddings Pennington et al. (2014) and keep them fixed to prevent shortcuts through word embeddings. We train the models for 25 epochs with RMSprop optimizers using default parameters and report the averaged results of 10 runs with different initializations.

[5] We use average pooling for baseline models, and sum pooling for attention models, as attention weights sum to 1.

No. Model Encoder Accuracy Attention to default tokens
(1) Baseline Pooling 71.7% –
(2) Baseline RCNN 96.1% –
(3) Attention Pooling 99.5% 66.4%
(4) Attention RCNN 99.6% 48.9%
Table 1: Experimental demonstration of combinatorial shortcuts. Note that we report the results on the training set here.

Interestingly, as we can see in Table 1, the attention models place a large share of attention on the default tokens: 66.4% and 48.9% of total attention weights are assigned to the default tokens by the models with pooling and RCNN encoders, respectively. Consequently, the accuracy scores of the attention models quickly grow above 99.5%. The results suggest that the attention mechanism may not work as expected to highlight the key parts of the inputs and provide interpretability. Instead, it tends to work as another kind of “encoding layer” and fit the data through combinatorial shortcuts.

4 Methods for Mitigating Combinatorial Shortcuts

In this section, we introduce two practical methods, random attention pretraining and mask-neutral learning with instance weighting, to mitigate combinatorial shortcuts.

4.1 Random attention pretraining

We first propose a simple and straightforward method to address the issue. As analyzed in Section 3.1, the fundamental reason for combinatorial shortcuts in attention mechanisms is the biased estimation of $\mathbb{E}[\ell(f(m \odot x), y)]$, while random combinations of $m$ and $x$ can still give unbiased results. Inspired by this, we can first generate the masks completely at random and train the neural networks. Then, we fix the other parts, replace the random attention with trainable attention layers, and train the attention layers only. As the other parts of the neural networks are trained unbiasedly and then fixed, training the attention layers alone amounts to solving $m^* = \arg\min_{m \in \mathcal{M}} \mathbb{E}[\ell(f(m \odot x), y)]$ with an unbiased estimation of $\mathbb{E}[\ell(f(m \odot x), y)]$. Thus the interpretability is guaranteed.
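A minimal sketch of the two-phase procedure on a toy linear-regression task (hypothetical data and names; phase 2 here scores single-feature masks against the frozen downstream model instead of training an actual attention layer):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.normal(size=(n, d))
true_w = np.zeros(d); true_w[:3] = 1.0            # only the first 3 features matter
y = X @ true_w + rng.normal(scale=0.1, size=n)

def random_mask(shape, k=3):
    """Uniformly random k-hot masks: every feature/mask combination is explored."""
    m = np.zeros(shape)
    for row in m:
        row[rng.choice(shape[1], size=k, replace=False)] = 1.0
    return m

# Phase 1: pretrain the downstream model under completely random masks, so the
# mask carries no information about y and the loss estimate is unbiased.
W = np.zeros(d)
for _ in range(200):
    M = random_mask(X.shape)
    grad = (M * X).T @ ((M * X) @ W - y) / n      # squared-loss gradient
    W -= 0.1 * grad

# Phase 2: freeze W; score each feature by the loss when only it is kept.
# (A stand-in for training the attention layer against the frozen model.)
def loss_with_only(j):
    m = np.zeros(d); m[j] = 1.0
    r = (m * X) @ W - y
    return (r ** 2).mean()

scores = np.array([-loss_with_only(j) for j in range(d)])   # higher = more useful alone
top3 = set(np.argsort(scores)[-3:])
# top3 recovers {0, 1, 2}, the truly informative features
```

Because the downstream model never saw mask patterns correlated with labels in phase 1, the only way to lower the loss in phase 2 is to select genuinely informative features.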

In theory, this method is complete. In practice, however, it may fall short because there are countless possible combinations of $m$ and $x$. It may be difficult to estimate $\mathbb{E}[\ell(f(m \odot x), y)]$ well, especially when the dimension of the input features is high, or when there are strong co-adapting patterns of features that models need for accurate predictions (e.g., XOR of features). In such cases, the pretraining procedure may be less efficient, as it needs to explore all possible masks fairly even though most of them are worthless. In conclusion, the model may fail to estimate $\mathbb{E}[\ell(f(m \odot x), y)]$ well in some cases, thus limiting the interpretability.

4.2 Mask-neutral learning with instance weighting

The second method is designed as a supplementary solution to address the shortcomings of random attention pretraining. It is based on instance weighting, which has been successfully applied to mitigating sample selection bias Zadrozny (2004); Zhang et al. (2019), social prejudice bias Zhang et al. (2020a), position bias Joachims et al. (2017), and so on. In this paper, we consider the selective combination of features and masks in the training procedure of attention mechanisms as a kind of sample selection bias. We prove that, under certain assumptions and with instance weighting, we can recover a mask-neutral distribution where the masks are unrelated to the labels. Thus the combinatorial shortcuts can be partially mitigated.

Generation of biased distributions of attention mechanisms from mask-neutral distributions

Assume that there is a mask-neutral distribution $\mathcal{D}$ with domain $\mathcal{X} \times \mathcal{Y} \times \mathcal{M} \times \mathcal{S}$, where $\mathcal{X}$ is the feature space, $\mathcal{Y}$ is the (binary) label space,[6] $\mathcal{M}$ is the feature mask space, and $\mathcal{S}$ is the (binary) sampling indicator space.

[6] We focus on binary classification problems in this paper, but the proposed methodology can be easily extended to multi-class classification.

During the training of attention mechanisms, the selective combination of masks and features results in combinatorial shortcuts. We assume that any given sample $(x, y, m, s)$ drawn independently from $\mathcal{D}$ will be chosen to appear in the training of attention mechanisms if and only if $s = 1$, which results in the biased distribution $\mathcal{D}_b$. We use $\Pr_b(\cdot)$ to represent probabilities under the biased distribution $\mathcal{D}_b$, and $\Pr(\cdot)$ for the mask-neutral distribution $\mathcal{D}$; then we have

$$\Pr\nolimits_b(x, y, m) = \Pr(x, y, m \mid s = 1), \tag{1}$$

and ideally, we should have $m$ independent of $x$ on $\mathcal{D}$ to obtain unbiased estimations of $\mathbb{E}[\ell(f(m \odot x), y)]$, as discussed in Section 3.1. However, when both sides are vectors this is intractable. Therefore, we take a step back and only assume that $m$ is independent of the (scalar) label $y$ in $\mathcal{D}$, i.e.,

$$\Pr(y \mid m) = \Pr(y). \tag{2}$$

If the selection $s$ were completely at random, $\mathcal{D}_b$ would be consistent with $\mathcal{D}$. However, the attention layers are highly selective, which results in the combinatorial shortcut problem. To further simplify the problem, we assume that $y$ and $m$ control $s$, and that for any given $y$ and $m$ the probability of selection is greater than $0$:

$$\Pr(s = 1 \mid x, y, m) = \Pr(s = 1 \mid y, m) > 0. \tag{3}$$

Additionally, we assume that the selection does not change the marginal probabilities of $y$ and $m$, i.e.,

$$\Pr\nolimits_b(y) = \Pr(y), \qquad \Pr\nolimits_b(m) = \Pr(m). \tag{4}$$

In other words, we assume that although $s$ depends on the combination of $y$ and $m$, it is independent of either $y$ or $m$ alone, i.e., $\Pr(s = 1 \mid y) = \Pr(s = 1)$ and $\Pr(s = 1 \mid m) = \Pr(s = 1)$.

Unbiased expectation of loss with instance weighting

We show that, by adding proper instance weights, we can obtain an unbiased estimation of the loss on the mask-neutral distribution $\mathcal{D}$, using only data from the biased distribution $\mathcal{D}_b$.

Theorem 1 (Unbiased Loss Expectation). For any function $f: \mathcal{X} \to \mathcal{Y}$ and any loss $\ell$, if we use $w = \frac{\Pr_b(y)\,\Pr_b(m)}{\Pr_b(y, m)}$ as the instance weights, then

$$\mathbb{E}_{\mathcal{D}_b}\!\left[w \cdot \ell(f(m \odot x), y)\right] = \mathbb{E}_{\mathcal{D}}\!\left[\ell(f(m \odot x), y)\right].$$

Theorem 1 shows that, with a proper instance-weighting method, the classifier can learn on the mask-neutral distribution $\mathcal{D}$, where $\Pr(y, m) = \Pr(y)\Pr(m)$. The independence between $y$ and $m$ is therefore encouraged, and it becomes hard for the classifier to approximate $y$ from $m$ alone. Thus, the classifier has to use the useful information in $m \odot x$, and the combinatorial shortcut problem is mitigated.

We now present the proof of Theorem 1.

Proof. We first rewrite the weight $w$ using assumptions (1)–(4):

$$w = \frac{\Pr_b(y)\,\Pr_b(m)}{\Pr_b(y, m)} = \frac{\Pr(y)\,\Pr(m)}{\Pr(y, m \mid s = 1)} = \frac{\Pr(y)\,\Pr(m)\,\Pr(s = 1)}{\Pr(s = 1 \mid y, m)\,\Pr(y, m)} = \frac{\Pr(s = 1)}{\Pr(s = 1 \mid y, m)},$$

where the last equality uses $\Pr(y, m) = \Pr(y)\Pr(m)$ from Equation (2). Then we have

$$\begin{aligned}
\mathbb{E}_{\mathcal{D}_b}\!\left[w \cdot \ell(f(m \odot x), y)\right]
&= \sum_{x, y, m} \Pr(x, y, m \mid s = 1)\, \frac{\Pr(s = 1)}{\Pr(s = 1 \mid y, m)}\, \ell(f(m \odot x), y) \\
&= \sum_{x, y, m} \frac{\Pr(s = 1 \mid x, y, m)\,\Pr(x, y, m)}{\Pr(s = 1)} \cdot \frac{\Pr(s = 1)}{\Pr(s = 1 \mid y, m)}\, \ell(f(m \odot x), y) \\
&= \sum_{x, y, m} \Pr(x, y, m)\, \ell(f(m \odot x), y) = \mathbb{E}_{\mathcal{D}}\!\left[\ell(f(m \odot x), y)\right],
\end{aligned}$$

where the last line uses Equation (3), $\Pr(s = 1 \mid x, y, m) = \Pr(s = 1 \mid y, m)$. ∎

Mask-neutral learning

With Theorem 1, we now propose mask-neutral learning for better interpretability of attention mechanisms. As shown, by adding the instance weight $w = \Pr_b(y)\Pr_b(m)/\Pr_b(y, m) = \Pr_b(y)/\Pr_b(y \mid m)$, we can obtain an unbiased loss on the mask-neutral distribution. As the distribution $\mathcal{D}_b$ is directly observable, estimating $w$ is possible. In practice, we can train a classifier to estimate $\Pr_b(y \mid m)$ along with the training of the attention layer, and alternately optimize it, the attention layers, and the other parts of the model.
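A minimal sketch of the weight computation (assuming, for illustration, discretized mask patterns so that $\Pr_b(y \mid m)$ can be estimated by counting, whereas in practice a classifier is trained for this estimate; the weight form $w = \Pr_b(y)\Pr_b(m)/\Pr_b(y,m) = \Pr_b(y)/\Pr_b(y \mid m)$ follows the reconstruction above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000
y = rng.integers(0, 2, size=n)

# Simulated shortcut: the (discretized) mask pattern is correlated with the label,
# e.g. pattern "A" is chosen for 80% of positives but only 20% of negatives.
m = np.where(rng.random(n) < np.where(y == 1, 0.8, 0.2), "A", "B")

# Estimate Pr_b(y) and Pr_b(y | m) by counting (a frequency-count stand-in for the
# auxiliary classifier trained alongside the attention layer).
p_y = np.array([np.mean(y == c) for c in (0, 1)])
p_y_given_m = {pat: np.array([np.mean(y[m == pat] == c) for c in (0, 1)])
               for pat in ("A", "B")}

# w = Pr_b(y) Pr_b(m) / Pr_b(y, m) = Pr_b(y) / Pr_b(y | m)
w = np.array([p_y[yi] / p_y_given_m[mi][yi] for yi, mi in zip(y, m)])

# Under the reweighted distribution, y and m are independent:
# the weighted frequency of (y=1, m="A") factorizes into the marginals.
p11 = w[(y == 1) & (m == "A")].sum() / w.sum()
p1 = w[y == 1].sum() / w.sum()
pA = w[m == "A"].sum() / w.sum()
# p11 == p1 * pA after reweighting (it does not factorize before)
```

Samples whose mask pattern is over-represented for their label receive down-weights, removing the label information the mask would otherwise carry.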

Compared with the random attention pretraining method, the instance-weighting approach concentrates more on the useful masks, so it may suffer less from the efficiency problem. Nevertheless, the effectiveness of the instance-weighting method relies on the assumptions in Equations (1)–(4), which may not always hold. For example, in Equation (3) we assume that, given $y$ and $m$, $s$ is independent of $x$; in other words, $x$ controls $s$ only through $m$ and $y$. This assumption is necessary for simplifying the problem, but it may fail when, given $y$ and $m$, $x$ can still influence $s$ in the training procedure of attention mechanisms. Besides, the effectiveness of the method also relies on an accurate estimation of $\Pr_b(y \mid m)$, which may require careful tuning, as this probability changes dynamically during the training of the attention mechanisms.

5 Experiments

In this section, we present the experimental results of the proposed methods. For brevity, we denote random attention pretraining as Pretraining and mask-neutral learning with instance weighting as Weighting. First, we analyze the effectiveness of mitigating combinatorial shortcuts; then, we examine the effectiveness of improving interpretability.

5.1 Experiments for mitigating combinatorial shortcuts

We apply the proposed methods to the experiments introduced in Section 3.2 to check whether we can mitigate the combinatorial shortcuts. We summarize the results in Table 2.

No. Method Encoder Accuracy Attention to default tokens
(1) Pretraining Pooling 91.8% 8.4%
(2) Pretraining RCNN 78.7% 7.1%
(3) Weighting Pooling 97.8% 5.8%
(4) Weighting RCNN 92.8% 17.0%
Table 2: Effectiveness of the proposed methods for mitigating the combinatorial shortcuts.

As presented, after applying Pretraining and Weighting, the percentage of attention weights assigned to the default tokens is significantly reduced. Since the default tokens provide no useful information and serve only as carriers for combinatorial shortcuts, the results reveal that our methods successfully mitigate the combinatorial shortcuts.

5.2 Experiments for improving interpretability

In this section, using L2X Chen et al. (2018), an end-to-end attention-based model-interpretation method, as an example, we present the effectiveness of mitigating combinatorial shortcuts for better interpretability. We first introduce the evaluation scheme, then show the experimental results and discussions.

5.2.1 Evaluation scheme

Here we present the evaluation scheme.

Evaluation protocol

Our evaluation scheme is the same as that of L2X Chen et al. (2018). L2X is an instance-wise feature selection model using hard attention that employs the Gumbel-softmax trick. It selects a certain number of input components with attention mechanisms to approximate the output of the model to be explained. As discussed before, such a setting may suffer from combinatorial shortcuts, so its interpretability may be limited. Additionally, to further enrich the information available for model explanation, we propose to incorporate the output of the original model to be explained, i.e., $\hat{y}$, as part of the query for feature selection. This trick makes it easier for the explanation model to select the best features. At the same time, it also makes the combinatorial shortcut problem more prominent, and can thus better demonstrate the effectiveness of our proposed methods. As obtaining the outputs requires nothing apart from the features of samples and the model to be explained, it neither hurts the model-agnostic property of the explanation methods nor requires additional information. We adopt a binary feature-attribution mask $m$ to select features, i.e., the top-$k$ values of the mask are set to $1$ and the others to $0$; we then treat $m \odot x$ as the selected features Chen et al. (2018).
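The top-$k$ binarization step can be sketched as follows (a NumPy stand-in; `topk_mask` is an illustrative helper, not the paper's code):

```python
import numpy as np

def topk_mask(scores, k):
    """Binary feature-attribution mask: top-k scores -> 1, the rest -> 0."""
    m = np.zeros_like(scores)
    idx = np.argsort(scores, axis=-1)[..., -k:]   # indices of the k largest scores
    np.put_along_axis(m, idx, 1.0, axis=-1)
    return m

scores = np.array([[0.1, 0.7, 0.05, 0.9, 0.2]])   # per-feature attention scores
m = topk_mask(scores, k=2)                         # -> [[0., 1., 0., 1., 0.]]
x = np.array([[3.0, 1.0, 4.0, 1.5, 9.0]])
selected = m * x                                   # m ⊙ x: only the selected features survive
```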

Evaluation metrics

Following Chen et al. (2018), we perform a predictive evaluation that measures how accurately the original model's outputs can be approximated using only the selected features, and we report the post-hoc accuracy. For each method on each dataset, we repeat the experiment ten times with different initializations and report the averaged results.
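A minimal sketch of the post-hoc accuracy computation (the model and masks here are hypothetical stand-ins):

```python
import numpy as np

def post_hoc_accuracy(model, X, masks):
    """Fraction of samples where the model's prediction on the selected features
    (non-selected ones zeroed out) matches its prediction on the full input."""
    full = model(X)
    masked = model(masks * X)
    return (full == masked).mean()

# Hypothetical "black-box" model: sign of a fixed linear score.
w = np.array([2.0, -1.0, 0.1, 0.0])
model = lambda X: (X @ w > 0).astype(int)

X = np.array([[ 1.0, 0.5, 3.0, 2.0],
              [-1.0, 0.5, 3.0, 2.0]])
good_masks = np.array([[1, 1, 0, 0],
                       [1, 1, 0, 0]])   # keep the two genuinely influential features
acc = post_hoc_accuracy(model, X, good_masks)   # 1.0 in this toy case
```

Masks that select uninformative features would lower the agreement, which is what the metric is designed to penalize.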


Datasets

We report evaluations on four datasets: IMDB Maas et al. (2011), Yelp P. Zhang et al. (2015), MNIST LeCun et al. (1998), and Fashion-MNIST (F-MNIST) Xiao et al. (2017). IMDB and Yelp P. are text classification datasets: IMDB contains 25,000 training examples and 25,000 test examples, and Yelp P. contains 560,000 training examples and 38,000 test examples. MNIST and F-MNIST are image classification datasets. For MNIST, following Chen et al. (2018), we build a binary classification subset from the images of digits 3 and 8, with 11,982 training examples and 1,984 test examples. For F-MNIST, we select the data of Pullover and Shirt, with 12,000 training examples and 2,000 test examples.

Models to be explained

Following Chen et al. (2018), for IMDB and Yelp P., we implement CNN-based models and select 10 and 5 words, respectively, for explanations. For MNIST and F-MNIST, we use the same CNN model as Chen et al. (2018) and select 25 and 64 pixels, respectively.


Baselines

We consider state-of-the-art model-agnostic baselines: LIME Ribeiro et al. (2016), CXPlain Schwab and Karlen (2019), L2X Chen et al. (2018), and VIBI Bang et al. (2019). We also compare with a model-specific baseline, Gradient Simonyan et al. (2013). Among these methods, Gradient takes advantage of the properties of neural networks and selects the input features with the largest absolute gradient values. LIME explains a model by quantifying the model's sensitivity to changes in the input. CXPlain uses the real labels: it computes loss-function values by erasing each feature to zero, and normalizes these values as a surrogate for ideal explanations that a neural network then learns. Our methods follow the same paradigm as L2X and VIBI, which use hard attention to select a fixed number of features to approximate the output of the original model to be explained. VIBI improves L2X by encouraging the briefness of the learned explanation, adding a constraint that draws the feature scores toward a global prior.

5.2.2 Experimental results

Following the aforementioned evaluation scheme, we report the results in Table 3.

No. Method IMDB Yelp P. MNIST F-MNIST
(1) Gradient Simonyan et al. (2013) 85.6% 82.3% 98.2% 58.6%
(2) LIME Ribeiro et al. (2016) 89.8% 87.4% 80.4% 75.6%
(3) CXPlain Schwab and Karlen (2019) 90.6% 97.7% 99.4% 59.7%
(4) L2X Chen et al. (2018) 89.2% 88.2% 91.4% 77.3%
(5) VIBI Bang et al. (2019) 90.8% 94.4% 98.3% 84.1%
(6) L2X with $\hat{y}$ 48.8% 77.8% 94.9% 85.3%
(7) Pretraining 97.1% 99.0% 66.3% 89.4%
(8) Weighting 94.3% 87.7% 99.8% 95.4%
CXPlain uses additional information, i.e., the real labels of samples.
The contribution of VIBI is orthogonal to ours.
Table 3: Effectiveness of the proposed methods for improving interpretability. We report the post-hoc accuracy scores with different methods.

From the table, comparing rows (4) and (6), we find that directly adding $\hat{y}$ to the query does not always improve performance. Interestingly, for the text classification datasets, adding $\hat{y}$ decreases performance, and Pretraining outperforms Weighting; for the image classification datasets, we reach the exact opposite conclusion. We ascribe this phenomenon to inherent differences between the two tasks. First, a single word in a sentence is much more informative than a single pixel in an image. Second, the importance of words is more “continuous”, while the importance of pixels is more “discrete” and co-adapting. Intuitively, $\mathbb{E}[\ell(f(m \odot x), y)]$ is smoother and easier to learn for text classification tasks than for image classification tasks. As a result, as discussed in Section 4.1, it may be hard for Pretraining to learn reasonable estimations of $\mathbb{E}[\ell(f(m \odot x), y)]$ efficiently for images, which limits its interpretability, especially for MNIST, where the digits are placed more randomly than the better-aligned items in F-MNIST.

By comparing with the other baselines (especially L2X with $\hat{y}$), we find that Pretraining and Weighting outperform most of the benchmarks. We conclude that mitigating combinatorial shortcuts can effectively improve interpretability.

6 Conclusion

Attention-based model interpretations have been popular for their convenience of integration with neural networks. However, there has been a debate on the interpretability of attention mechanisms. In this paper, we propose that combinatorial shortcuts are one of the root causes hindering the interpretability of attention mechanisms. We analyze combinatorial shortcuts theoretically and design experiments to demonstrate their existence. Furthermore, we propose two practical methods to mitigate combinatorial shortcuts for better interpretability. Experiments on real-world datasets show that the proposed methods are effective, and they can be applied to any attention-based model-interpretation task. To the best of our knowledge, this is the first work to highlight combinatorial shortcuts in attention mechanisms.

Broader Impact Statement

This paper investigates an essential problem of model interpretability, which is crucial for a variety of real-world (especially high-stakes) applications that need transparent decision making, such as medicine, security, criminal justice, and education. Deriving more precise model interpretations and explanations can better reveal how a model works during the decision-making process and thus alleviate potential errors and biases. Better model interpretability can also enhance model trustworthiness and generalization ability. One potential risk of improving model interpretability is that it may make the model more vulnerable to adversarial attacks. However, better interpretability can also guide researchers to design better model defense mechanisms. Moreover, as adversarial attacks typically require 1) knowledge of the model mechanism and 2) data manipulation, users should maintain proper access control over their models and data.


  • M. Ancona, E. Ceolini, C. Öztireli, and M. Gross (2017) A unified view of gradient-based attribution methods for deep neural networks. In NIPS Workshop on Interpreting, Explaining and Visualizing Deep Learning: Now What? (NIPS 2017). Cited by: §2, footnote 3.
  • D. Bahdanau, K. Cho, and Y. Bengio (2015) Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations. Cited by: §2.
  • B. Bai, G. Zhang, Y. Lin, H. Li, K. Bai, and B. Luo (2020) CSRN: collaborative sequential recommendation networks for news retrieval. arXiv preprint arXiv:2004.04816. Cited by: §2.
  • S. Bang, P. Xie, W. Wu, and E. Xing (2019) Explaining a black-box using deep variational information bottleneck approach. arXiv preprint arXiv:1902.06918. Cited by: §1, §2, §5.2.1, Table 3, footnote 1.
  • T. Chakraborti, A. Kulkarni, S. Sreedharan, D. E. Smith, and S. Kambhampati (2019) Explicability? legibility? predictability? transparency? privacy? security? the emerging landscape of interpretable agent behavior. In Proceedings of the International Conference on Automated Planning and Scheduling, Vol. 29, pp. 86–96. Cited by: §1.
  • J. Chen, L. Song, M. Wainwright, and M. Jordan (2018) Learning to explain: an information-theoretic perspective on model interpretation. In International Conference on Machine Learning, pp. 883–892. Cited by: §1, §1, §2, §2, §5.2.1, §5.2.1, §5.2.1, §5.2.1, §5.2.1, §5.2, Table 3, footnote 1.
  • E. Choi, M. T. Bahadori, J. Sun, J. Kulas, A. Schuetz, and W. Stewart (2016) Retain: an interpretable predictive model for healthcare using reverse time attention mechanism. In Advances in Neural Information Processing Systems, pp. 3504–3512. Cited by: §1, §2.
  • M. Du, N. Liu, and X. Hu (2019) Techniques for interpretable machine learning. Communications of the ACM 63 (1), pp. 68–77. Cited by: §2.
  • W. Fan, I. Davidson, B. Zadrozny, and P. S. Yu (2005) An improved categorization of classifier’s sensitivity on sample selection bias. In Proceedings of the Fifth IEEE International Conference on Data Mining, pp. 605–608. Cited by: §3.1.
  • K. Fu, J. Jin, R. Cui, F. Sha, and C. Zhang (2016)

    Aligning where to see and what to tell: image captioning with region-based attention and scene-specific contexts

    IEEE transactions on pattern analysis and machine intelligence 39 (12), pp. 2321–2334. Cited by: §2.
  • S. Hooker, D. Erhan, P. Kindermans, and B. Kim (2019) A benchmark for interpretability methods in deep neural networks. In Advances in Neural Information Processing Systems, pp. 9734–9745. Cited by: §2.
  • S. Jain and B. C. Wallace (2019) Attention is not explanation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 3543–3556. Cited by: §1, §2.
  • E. Jang, S. Gu, and B. Poole (2017) Categorical reparametrization with gumble-softmax. In International Conference on Learning Representations, Cited by: §2.
  • T. Joachims, A. Swaminathan, and T. Schnabel (2017) Unbiased learning-to-rank with biased feedback. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pp. 781–789. Cited by: §4.2.
  • S. Lai, L. Xu, K. Liu, and J. Zhao (2015) Recurrent convolutional neural networks for text classification. In

    Twenty-ninth AAAI conference on artificial intelligence

    Cited by: §3.3.
  • Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE 86 (11), pp. 2278–2324. Cited by: §5.2.1.
  • Y. Li, X. Chen, Z. Zhu, L. Xie, G. Huang, D. Du, and X. Wang (2019) Attention-guided unified network for panoptic segmentation. In

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

    pp. 7026–7035. Cited by: §2.
  • Z. C. Lipton (2018) The mythos of model interpretability. Queue 16 (3), pp. 31–57. Cited by: §1.
  • A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts (2011)

    Learning word vectors for sentiment analysis

    In Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies-volume 1, pp. 142–150. Cited by: §3.2, §5.2.1.
  • A. Martins and R. Astudillo (2016) From softmax to sparsemax: a sparse model of attention and multi-label classification. In International Conference on Machine Learning, pp. 1614–1623. Cited by: §2.
  • J. Pennington, R. Socher, and C. D. Manning (2014) Glove: global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp. 1532–1543. Cited by: §3.3.
  • G. Plumb, D. Molitor, and A. S. Talwalkar (2018) Model agnostic supervised local explanations. In Advances in Neural Information Processing Systems, pp. 2515–2524. Cited by: footnote 3.
  • M. T. Ribeiro, S. Singh, and C. Guestrin (2016) Why should i trust you?: explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1135–1144. Cited by: §5.2.1, Table 3.
  • D. B. Rubin (1974) Estimating causal effects of treatments in randomized and nonrandomized studies.. Journal of educational Psychology 66 (5), pp. 688. Cited by: §3.1, §3.1.
  • B. Schneider, M. Carnoy, J. Kilpatrick, W. H. Schmidt, and R. J. Shavelson (2007) Estimating causal effects using experimental and observational design. American Educational & Reseach Association. Cited by: §3.1.
  • P. Schwab and W. Karlen (2019) CXPlain: causal explanations for model interpretation under uncertainty. In Advances in Neural Information Processing Systems, pp. 10220–10230. Cited by: §5.2.1, Table 3.
  • S. Serrano and N. A. Smith (2019) Is attention interpretable?. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 2931–2951. Cited by: §1, §2, §2.
  • W. R. Shadish, M. H. Clark, and P. M. Steiner (2008) Can nonrandomized experiments yield accurate answers? a randomized experiment comparing random and nonrandom assignments. Journal of the American statistical association 103 (484), pp. 1334–1344. Cited by: §3.1.
  • K. Simonyan, A. Vedaldi, and A. Zisserman (2013) Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034. Cited by: §5.2.1, Table 3.
  • M. Sundararajan, A. Taly, and Q. Yan (2017) Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 3319–3328. Cited by: §2.
  • Y. Tay, A. T. Luu, and S. C. Hui (2018) Multi-pointer co-attention networks for recommendation. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2309–2318. Cited by: §2.
  • V. Vapnik (1992) Principles of risk minimization for learning theory. In Advances in neural information processing systems, pp. 831–838. Cited by: §3.1.
  • A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin (2017) Attention is all you need. In Advances in neural information processing systems, pp. 5998–6008. Cited by: §1, footnote 2.
  • O. Vinyals, Ł. Kaiser, T. Koo, S. Petrov, I. Sutskever, and G. Hinton (2015) Grammar as a foreign language. In Advances in neural information processing systems, pp. 2773–2781. Cited by: §2.
  • F. Wang, R. Kaushal, and D. Khullar (2019) Should health care demand interpretable artificial intelligence or accept “black box” medicine?. Annals of Internal Medicine. Cited by: §1.
  • Y. Wang, M. Huang, X. Zhu, and L. Zhao (2016) Attention-based lstm for aspect-level sentiment classification. In Proceedings of the 2016 conference on empirical methods in natural language processing, pp. 606–615. Cited by: §1, §2.
  • S. Wiegreffe and Y. Pinter (2019) Attention is not not explanation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 11–20. Cited by: §1, §2.
  • H. Xiao, K. Rasul, and R. Vollgraf (2017) Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747. Cited by: §5.2.1.
  • K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhudinov, R. Zemel, and Y. Bengio (2015) Show, attend and tell: neural image caption generation with visual attention. In International conference on machine learning, pp. 2048–2057. Cited by: §2.
  • C. Yeh, C. Hsieh, A. Suggala, D. I. Inouye, and P. K. Ravikumar (2019) On the (in) fidelity and sensitivity of explanations. In Advances in Neural Information Processing Systems, pp. 10965–10976. Cited by: §2, footnote 3.
  • B. Zadrozny (2004) Learning and evaluating classifiers under sample selection bias. In Proceedings of the twenty-first international conference on Machine learning, pp. 114. Cited by: §4.2.
  • G. Zhang, B. Bai, J. Liang, K. Bai, S. Chang, M. Yu, C. Zhu, and T. Zhao (2019) Selection bias explorations and debias methods for natural language sentence matching datasets. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 4418–4429. Cited by: §4.2.
  • G. Zhang, B. Bai, J. Zhang, K. Bai, C. Zhu, and T. Zhao (2020a) Demographics should not be the reason of toxicity: mitigating discrimination in text classifications with instance weighting. arXiv preprint arXiv:2004.14088. Cited by: §4.2.
  • J. Zhang, B. Bai, Y. Lin, J. Liang, K. Bai, and F. Wang (2020b) General-purpose user embeddings based on mobile app usage. arXiv preprint arXiv:2005.13303. Cited by: §2.
  • X. Zhang, J. Zhao, and Y. LeCun (2015) Character-level convolutional networks for text classification. In Advances in neural information processing systems, pp. 649–657. Cited by: §5.2.1.
  • F. Zhu, H. Li, W. Ouyang, N. Yu, and X. Wang (2017) Learning spatial regularization with image-level supervisions for multi-label image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5513–5522. Cited by: footnote 1.