Visualizing Attention in Transformer-Based Language Representation Models

04/04/2019
by Jesse Vig, et al.

We present an open-source tool for visualizing multi-head self-attention in Transformer-based language models. The tool extends earlier work by visualizing attention at three levels of granularity: the attention-head level, the model level, and the neuron level. We describe how each of these views can help interpret the model, and we demonstrate the tool on the OpenAI GPT-2 pretrained language model. We also present three use cases showing how the tool might provide insight into how to adapt or improve the model.
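As a concrete illustration of the per-head attention tensors such a tool visualizes, the sketch below extracts attention weights from the pretrained GPT-2 model. This is an assumption made here for illustration only: it uses the Hugging Face transformers library, which may differ from how the paper's tool obtains these weights, and the model name "gpt2" and the printed head-level summary are choices of this sketch, not part of the paper.

```python
# Minimal sketch: extract per-layer, per-head self-attention weights from
# GPT-2 via Hugging Face transformers (an assumption; not the paper's tool).
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# output_attentions=True makes the model return the softmax attention maps.
model = GPT2Model.from_pretrained("gpt2", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len), rows summing to 1.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
layer, head = 0, 0  # hypothetical choice of one head to inspect
attn = outputs.attentions[layer][0, head]  # (seq_len, seq_len)

# Head-level view: for each token, the earlier position it attends to most.
for i, tok in enumerate(tokens):
    top = attn[i].argmax().item()
    print(f"{tok!r:>12} -> {tokens[top]!r} ({attn[i, top].item():.2f})")
```

Matrices of this shape are what the three views described above operate on: the attention-head view renders one such matrix, the model view tiles them across all layers and heads, and the neuron view drills into the query and key vectors that produce them.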
