Effective Attention Sheds Light On Interpretability

05/18/2021
by Kaiser Sun, et al.

The attention matrix of a transformer self-attention sublayer can provably be decomposed into two components, only one of which (effective attention) contributes to the model output. This raises the question of whether visualizing effective attention leads to different conclusions than interpreting standard attention. Using BERT and a subset of the GLUE tasks, we carry out an analysis comparing the two attention matrices and show that their interpretations differ. Effective attention is less associated with features tied to the language-modeling pretraining objective, such as the separator token, and has more potential to reveal the linguistic features the model captures for solving the end task. Given these differences, we recommend using effective attention when studying a transformer's behavior, since by design it is more pertinent to the model output.
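The decomposition above can be sketched concretely. Assuming effective attention is defined as the attention matrix with its component in the left null space of the value matrix removed (the part that provably cancels in the product A·V), a minimal NumPy illustration looks like this; the function name and tolerance are illustrative, not from the paper:

```python
import numpy as np

def effective_attention(A, V):
    """Remove the component of each attention row lying in the
    left null space of V; that component contributes nothing to A @ V."""
    U, s, _ = np.linalg.svd(V, full_matrices=True)
    rank = int(np.sum(s > 1e-10))
    N = U[:, rank:]           # orthonormal basis of {x : x.T @ V = 0}
    return A - A @ N @ N.T    # project each row of A off that null space

rng = np.random.default_rng(0)
n, d = 6, 3                   # sequence length > head dim, so the null space is nonempty
A = rng.random((n, n))
A /= A.sum(axis=1, keepdims=True)      # row-stochastic attention weights
V = rng.standard_normal((n, d))        # value matrix

A_eff = effective_attention(A, V)
assert np.allclose(A @ V, A_eff @ V)   # identical sublayer output by construction
```

Because the removed component is annihilated by V, the sublayer output is unchanged, yet A_eff can assign visibly different weights than A, which is why the two matrices can support different interpretations.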


Related research

05/02/2020  Synthesizer: Rethinking Self-Attention in Transformer Models
08/30/2021  Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
08/21/2019  Revealing the Dark Secrets of BERT
03/05/2021  Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
04/21/2020  Attention Module is Not Only a Weight: Analyzing Transformers with Vector Norms
06/07/2021  On the Expressive Power of Self-Attention Matrices
05/23/2022  Outliers Dimensions that Disrupt Transformers Are Driven by Frequency
