GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers

by Ali Modarressi, et al.

There has been a growing interest in interpreting the underlying dynamics of Transformers. While self-attention patterns were initially deemed the primary option, recent studies have shown that integrating other components can yield more accurate explanations. This paper introduces a novel token attribution analysis method that incorporates all the components in the encoder block and aggregates this information across layers. Through extensive quantitative and qualitative experiments, we demonstrate that our method can produce faithful and meaningful global token attributions. Our experiments reveal that incorporating almost every encoder component results in increasingly more accurate analysis in both local (single-layer) and global (whole-model) settings. Our global attribution analysis significantly outperforms previous methods on various tasks in terms of correlation with gradient-based saliency scores. Our code is freely available at
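The aggregation step described above, combining per-layer token attribution maps into a single global map, can be sketched in rollout style: each layer's attribution matrix is adjusted for the residual connection, row-normalized, and multiplied into the running product. This is a minimal illustrative sketch, not the paper's exact method; the `aggregate_attributions` helper, the 0.5 residual weighting, and the toy random maps are assumptions for demonstration.

```python
import numpy as np

def aggregate_attributions(layer_maps):
    """Aggregate per-layer token attribution matrices into a global map
    by recursive matrix multiplication (rollout-style sketch).

    layer_maps: list of (n_tokens, n_tokens) row-stochastic arrays, one per
    encoder layer, where entry [i, j] is how much input token j contributes
    to output token i at that layer.
    """
    rollout = np.eye(layer_maps[0].shape[0])
    for A in layer_maps:
        # Assumed simplification: mix in the identity to account for the
        # residual connection, then renormalize rows to sum to 1.
        A_res = 0.5 * A + 0.5 * np.eye(A.shape[0])
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)
        rollout = A_res @ rollout
    return rollout

# Toy example: 3 layers of random attribution maps over 4 tokens.
rng = np.random.default_rng(0)
maps = [rng.random((4, 4)) for _ in range(3)]
maps = [m / m.sum(axis=-1, keepdims=True) for m in maps]
global_attr = aggregate_attributions(maps)
print(global_attr.shape)
```

Because each adjusted layer map is row-stochastic, the aggregated global map is too, so each row can be read as a distribution over input tokens.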
