Demystifying the Better Performance of Position Encoding Variants for Transformer

04/18/2021
by Pu-Chin Chen, et al.

Transformers are state-of-the-art models in NLP that map a given input sequence of vectors to an output sequence of vectors. However, these models are permutation equivariant, so position embeddings added to the input are used to supply information about the order of the input tokens. Further, for some tasks, additional additive segment embeddings are used to denote different types of input sentences. Recent work has proposed variations of positional encodings, with relative position encodings achieving better performance. In this work, we conduct a systematic study comparing different position encodings and investigating the reasons for the differences in their performance. We demonstrate a simple yet effective way to encode position and segment information in Transformer models. The proposed method performs on par with SOTA on the GLUE, XTREME and WMT benchmarks while saving computation costs.
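To make the distinction concrete, here is a minimal sketch (in PyTorch, with hypothetical names such as AdditiveInputEmbedding and relative_position_bias, and arbitrary default sizes) of the two schemes the abstract contrasts: absolute position and segment embeddings added to the token embeddings at the input, versus a relative position bias added to the attention logits. It illustrates the general recipes, not the paper's exact method.

import torch
import torch.nn as nn

class AdditiveInputEmbedding(nn.Module):
    """Token + absolute position + segment embeddings, summed at the input
    (the BERT-style additive scheme)."""

    def __init__(self, vocab_size=30522, max_len=512, num_segments=2, d_model=768):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)       # learned absolute positions
        self.seg = nn.Embedding(num_segments, d_model)  # sentence-type (segment) ids

    def forward(self, token_ids, segment_ids):
        # token_ids, segment_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.tok(token_ids) + self.pos(positions) + self.seg(segment_ids)

def relative_position_bias(seq_len, num_heads, max_distance=128):
    """Relative encodings instead inject order inside attention: a learned
    bias indexed by the clipped offset j - i is added to each logit."""
    bias_table = nn.Parameter(torch.zeros(2 * max_distance + 1, num_heads))
    offsets = torch.arange(seq_len)[None, :] - torch.arange(seq_len)[:, None]
    offsets = offsets.clamp(-max_distance, max_distance) + max_distance
    # (seq_len, seq_len, num_heads) -> (num_heads, seq_len, seq_len),
    # to be added to the scaled QK^T scores in attention
    return bias_table[offsets].permute(2, 0, 1)

# Usage: the summed embeddings feed the encoder once; the relative bias
# is applied inside the attention layers.
emb = AdditiveInputEmbedding()
tokens = torch.randint(0, 30522, (1, 16))
segments = torch.zeros(1, 16, dtype=torch.long)
x = emb(tokens, segments)                        # (1, 16, 768)
bias = relative_position_bias(16, num_heads=12)  # (12, 16, 16)

Note that the additive variant touches only the input embeddings, while the relative variant has to be applied inside every attention layer, which is one reason the choice of encoding affects computation cost.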


Related research

07/29/2021 · Rethinking and Improving Relative Position Encoding for Vision Transformer
Relative position encoding (RPE) is important for transformer to capture...

11/08/2022 · Word Order Matters when you Increase Masking
Word order, an essential property of natural languages, is injected in T...

03/13/2020 · Learning to Encode Position for Transformer with Continuous Dynamical Model
We introduce a new way of learning to encode position information for no...

02/22/2021 · Do We Really Need Explicit Position Encodings for Vision Transformers?
Almost all visual transformers such as ViT or DeiT rely on predefined po...

02/13/2023 · Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation
Context-aware translation can be achieved by processing a concatenation ...

04/18/2022 · Dynamic Position Encoding for Transformers
Recurrent models have been dominating the field of neural machine transl...

05/18/2021 · Relative Positional Encoding for Transformers with Linear Complexity
Recent advances in Transformer models allow for unprecedented sequence l...
