The Double Helix inside the NLP Transformer

06/23/2023
by Jason H. J. Lu, et al.

We introduce a framework for analyzing the types of information carried through an NLP Transformer. In this approach, we distinguish four layers of information: positional, syntactic, semantic, and contextual. We also argue that the common practice of adding positional information to the semantic embedding is suboptimal, and propose a Linear-and-Add approach instead. Our analysis reveals an autogenetic separation of positional information through the deep layers. We show that the distilled positional components of the embedding vectors follow the path of a helix, on both the encoder side and the decoder side. We additionally show that on the encoder side, the conceptual dimensions generate Part-of-Speech (PoS) clusters. On the decoder side, we show that a di-gram approach helps reveal the PoS clusters of the next token. Our approach paves the way to elucidating how information is processed through the deep layers of an NLP Transformer.
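The paper's own formulation is not reproduced in this abstract, but two of its ideas can be illustrated with the classic sinusoidal positional encoding: each (sin, cos) dimension pair stays on the unit circle, so the encoding traces a helix along the position axis, and the additive combination of token and positional vectors can be contrasted with a linear-map-then-add variant. The `Linear-and-Add` sketch below is a hypothetical illustration, not the authors' exact method; the matrix `W` is random rather than learned.

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    """Classic sinusoidal positional encoding (Vaswani et al., 2017)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angle = pos / (10000.0 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)  # even dimensions: sine
    pe[:, 1::2] = np.cos(angle)  # odd dimensions: cosine
    return pe

seq_len, d_model = 64, 16
pe = sinusoidal_pe(seq_len, d_model)

# Each (sin, cos) pair lies on the unit circle, so plotting
# (pe[:, 0], pe[:, 1], position) traces a helix along the sequence axis,
# the kind of helical positional trajectory the abstract describes.
radius = np.hypot(pe[:, 0], pe[:, 1])
print(np.allclose(radius, 1.0))  # True

# Standard practice: add positional info directly to the token embedding.
rng = np.random.default_rng(0)
tok = rng.normal(size=(seq_len, d_model))
x_add = tok + pe

# Hypothetical "Linear-and-Add" variant: pass the token embedding through
# a linear map W before adding positional info (W is random here purely
# for illustration; the paper's formulation may differ).
W = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
x_lin_add = tok @ W + pe
print(x_add.shape == x_lin_add.shape == (seq_len, d_model))  # True
```

In both variants the positional component occupies the same vector space as the semantic component; the paper's claim is that how the two are combined affects how cleanly they separate again in the deep layers.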


