Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors

03/29/2021
by Zeyu Yun, et al.

Transformer networks have revolutionized NLP representation learning since their introduction. Although great effort has been made to explain the representations learned by transformers, it is widely recognized that our understanding is insufficient. One important reason is the lack of visualization tools for detailed analysis. In this paper, we propose to use dictionary learning to open up these 'black boxes', modeling contextualized embeddings as linear superpositions of transformer factors. Through visualization, we demonstrate the hierarchical semantic structures captured by the transformer factors, e.g., word-level polysemy disambiguation, sentence-level pattern formation, and long-range dependency. While some of these patterns confirm conventional prior linguistic knowledge, the rest are relatively unexpected and may provide new insights. We hope this visualization tool can bring further knowledge and a better understanding of how transformer networks work.
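The decomposition the abstract describes can be sketched with off-the-shelf sparse coding. This is a minimal illustration, not the paper's implementation: the embeddings below are random stand-ins for contextualized hidden states (in the paper, one row per token collected from a transformer layer), and the hyperparameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Hypothetical stand-in for contextualized embeddings: in the paper, X would
# be hidden states gathered from one layer of a transformer, one row per token.
rng = np.random.default_rng(0)
n_tokens, dim, n_factors = 200, 64, 32
X = rng.normal(size=(n_tokens, dim))

# Learn a dictionary of "transformer factors" Phi and sparse codes alpha so
# that each embedding x is approximated as a sparse linear superposition
# x ≈ alpha @ Phi.
dl = DictionaryLearning(
    n_components=n_factors,
    transform_algorithm="lasso_lars",
    transform_alpha=0.1,   # sparsity penalty (illustrative value)
    max_iter=20,
    random_state=0,
)
alpha = dl.fit_transform(X)   # sparse coefficients, shape (n_tokens, n_factors)
Phi = dl.components_          # dictionary of factors, shape (n_factors, dim)

reconstruction = alpha @ Phi  # linear superposition of transformer factors
print(alpha.shape, Phi.shape)
```

Visualizing which tokens activate a given factor (rows of `alpha` sorted by a fixed column) is then what surfaces the word-level, sentence-level, and long-range patterns the paper reports.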


