A Multiscale Visualization of Attention in the Transformer Model

06/12/2019
by Jesse Vig, et al.

The Transformer is a sequence model that forgoes traditional recurrent architectures in favor of a fully attention-based approach. Besides improving performance, attention offers a route to interpretability, since it shows how the model assigns weight to different input elements. However, the multi-layer, multi-head attention mechanism in the Transformer can be difficult to decipher. To make the model more accessible, we introduce an open-source tool that visualizes attention at multiple scales, each of which provides a unique perspective on the attention mechanism. We demonstrate the tool on BERT and OpenAI GPT-2 and present three example use cases: detecting model bias, locating relevant attention heads, and linking neurons to model behavior.
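As a rough sketch of the data such a tool renders, the snippet below extracts the multi-layer, multi-head attention weights from a pretrained BERT model and plots a single head as a heatmap. The HuggingFace transformers library, the model name, and the layer/head indices are illustrative assumptions here, not details of the paper's own tool.

```python
# Illustrative sketch only: extract the per-layer, per-head attention
# weights that a multiscale visualization is built on. The library,
# model, and chosen layer/head are assumptions, not the paper's tool.
import matplotlib.pyplot as plt
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of
# shape (batch, num_heads, seq_len, seq_len). A head-level view plots
# one head; a model-level view would tile all layers and heads at once.
attentions = outputs.attentions
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

layer, head = 4, 7  # arbitrary indices chosen for illustration
weights = attentions[layer][0, head].detach().numpy()

fig, ax = plt.subplots()
ax.imshow(weights, cmap="viridis")
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens, rotation=90)
ax.set_yticks(range(len(tokens)))
ax.set_yticklabels(tokens)
ax.set_title(f"Attention weights: layer {layer}, head {head}")
plt.tight_layout()
plt.show()
```

Each of the three use cases in the abstract corresponds to inspecting this same tensor at a different scale: scanning many heads for a pattern (locating relevant heads), comparing weights across contrasting inputs (detecting bias), or drilling below the weights into their constituent query and key components (linking neurons to behavior).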

Related research

04/04/2019 · Visualizing Attention in Transformer-Based Language Representation Models
We present an open-source tool for visualizing multi-head self-attention...

06/17/2021 · Multi-head or Single-head? An Empirical Comparison for Transformer Training
Multi-head attention plays a crucial role in the recent success of Trans...

03/26/2021 · Dodrio: Exploring Transformer Models with Interactive Visualization
Why do large pre-trained transformer-based models perform so well across...

07/15/2019 · Agglomerative Attention
Neural networks using transformer-based architectures have recently demo...

06/19/2022 · Learning Multiscale Transformer Models for Sequence Generation
Multiscale feature hierarchies have been witnessed the success in the co...

09/29/2020 · Attention that does not Explain Away
Models based on the Transformer architecture have achieved better accura...