Autoregressive Modeling with Lookahead Attention

05/20/2023
by Li Du, et al.

To predict the next token, autoregressive models ordinarily examine only the past. Could they also benefit from examining hypothetical futures? We consider a novel Transformer-based autoregressive architecture that estimates the next-token distribution by extrapolating multiple continuations of the past, according to some proposal distribution, and attending to these extended strings. This architecture draws insight from classical AI systems such as board-game players: when making a local decision, a policy may benefit from exploring possible future trajectories and analyzing them. On multiple tasks, including morphological inflection and Boolean satisfiability, our lookahead model outperforms an ordinary Transformer model of comparable size. However, on some tasks it appears to benefit from the extra computation without actually using the lookahead information. We discuss possible variant architectures as well as future speedups.
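The mechanism the abstract describes — sample several continuations of the prefix from a proposal distribution, then attend over each extended string to form the next-token distribution — can be sketched in miniature as follows. This is a hedged toy illustration, not the paper's model: the random embeddings, the uniform proposal distribution, the single attention head, and all names (`proposal_sample`, `lookahead_next_token_dist`, etc.) are stand-in assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, LOOKAHEAD, NUM_ROLLOUTS = 8, 16, 3, 4

# Toy stand-ins for trained parameters.
embed = rng.normal(size=(VOCAB, DIM))      # token embedding table
out_proj = rng.normal(size=(DIM, VOCAB))   # output projection to vocab logits

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def proposal_sample(prefix, steps):
    """Hypothetical proposal distribution: extend the prefix with
    uniformly random tokens (a real system would use a language model)."""
    return prefix + list(rng.integers(0, VOCAB, size=steps))

def attend(query, keys):
    """Single-head dot-product attention of one query over a set of keys."""
    scores = softmax(keys @ query / np.sqrt(DIM))
    return scores @ keys

def lookahead_next_token_dist(prefix):
    """Average the next-token distributions obtained by attending over
    several sampled continuations (past + hypothetical future) of the prefix."""
    query = embed[prefix[-1]]
    dists = []
    for _ in range(NUM_ROLLOUTS):
        extended = proposal_sample(prefix, LOOKAHEAD)   # prefix + lookahead tokens
        keys = embed[np.array(extended)]                # attend over the whole string
        ctx = attend(query, keys)
        dists.append(softmax(ctx @ out_proj))
    return np.mean(dists, axis=0)

p = lookahead_next_token_dist([1, 4, 2])   # a proper distribution over VOCAB tokens
```

Averaging over rollouts is just one plausible way to aggregate the extended strings; the paper's architecture attends to them inside the Transformer rather than mixing output distributions.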


