A Length-Extrapolatable Transformer

12/20/2022
by Yutao Sun, et al.

Position modeling plays a critical role in Transformers. In this paper, we focus on length extrapolation, i.e., training on short texts while evaluating on longer sequences. We define attention resolution as an indicator of extrapolation, then propose two designs that improve this metric for Transformers. Specifically, we introduce a relative position embedding that explicitly maximizes attention resolution. Moreover, we use blockwise causal attention during inference for better resolution. We evaluate different Transformer variants on language modeling. Experimental results show that our model achieves strong performance in both interpolation and extrapolation settings. The code will be available at https://aka.ms/LeX-Transformer.
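The abstract names two concrete mechanisms: a relative position embedding that maximizes attention resolution, and blockwise causal attention at inference time. Below is a minimal PyTorch sketch of both ideas, assuming a RoPE-style rotation combined with a per-dimension exponential decay applied reciprocally to queries and keys, and a mask that lets each token attend causally to its own block and the previous one. The function names (`xpos_like`, `blockwise_causal_mask`) and hyperparameters (`gamma=0.4`, `scale_base=512`) are illustrative assumptions, not the authors' exact implementation.

```python
import torch

def rotate_every_two(x):
    # Pair dims (x0, x1) -> (-x1, x0): the rotation used by rotary embeddings.
    x1, x2 = x[..., ::2], x[..., 1::2]
    return torch.stack((-x2, x1), dim=-1).flatten(-2)

def xpos_like(q, k, gamma=0.4, scale_base=512):
    """Rotary rotation plus reciprocal exponential scaling on q and k.

    q, k: (seq_len, head_dim). Queries are scaled by zeta**n and keys by
    zeta**(-n), so the logit q_m . k_n depends only on the rotation for
    (m - n) and a factor zeta**((m - n) / scale_base) that decays with
    distance, which is the relative, resolution-friendly behavior the
    abstract describes.
    """
    seq_len, dim = q.shape
    half = dim // 2
    # Standard RoPE frequencies.
    freqs = 1.0 / (10000 ** (torch.arange(half, dtype=torch.float) / half))
    pos = torch.arange(seq_len, dtype=torch.float)
    angles = pos[:, None] * freqs[None, :]            # (seq_len, half)
    sin = angles.sin().repeat_interleave(2, dim=-1)   # (seq_len, dim)
    cos = angles.cos().repeat_interleave(2, dim=-1)
    # Per-dimension decay base in (gamma / (1 + gamma), 1]; gamma is assumed.
    zeta = (torch.arange(half, dtype=torch.float) / half + gamma) / (1 + gamma)
    zeta = zeta.repeat_interleave(2, dim=-1)          # (dim,)
    power = (pos - seq_len // 2) / scale_base         # centered to keep scales moderate
    scale = zeta[None, :] ** power[:, None]           # (seq_len, dim)
    q_rot = q * cos + rotate_every_two(q) * sin
    k_rot = k * cos + rotate_every_two(k) * sin
    return q_rot * scale, k_rot / scale

def blockwise_causal_mask(seq_len, block_size):
    """Boolean mask (True = may attend): each token attends causally
    within its own block and to the immediately preceding block."""
    pos = torch.arange(seq_len)
    blk = pos // block_size
    causal = pos[:, None] >= pos[None, :]
    near = (blk[:, None] - blk[None, :]) <= 1
    return causal & near
```

At inference on sequences longer than the training length, the attention logits would be computed as `q_out @ k_out.T` with this mask applied, so every window the model actually attends over matches the distances it saw during training.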


