Location Attention for Extrapolation to Longer Sequences

11/10/2019
by Yann Dubois, et al.

Neural networks are surprisingly good at interpolating and perform remarkably well when the training set examples resemble those in the test set. However, they are often unable to extrapolate patterns beyond the seen data, even when the abstractions required for such patterns are simple. In this paper, we first review the notion of extrapolation, why it is important, and how one might hope to tackle it. We then focus on a specific type of extrapolation that is especially useful for natural language processing: generalization to sequences longer than those seen during training. We hypothesize that models with separate content- and location-based attention are more likely to extrapolate than those with common attention mechanisms. We empirically support this claim for recurrent seq2seq models equipped with our proposed attention mechanism on variants of the Lookup Table task. This sheds light on some striking failures of neural models for sequences and on possible methods for approaching such issues.
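To make the hypothesis concrete, here is a minimal sketch of what "separate content- and location-based attention" could look like: one score distribution computed from query-key content similarity and a second computed purely from position features, mixed by a gate. The function name, the position encoding, and the fixed mixing weight `lam` are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def separate_attention(query, keys, positions, w_loc, lam=0.5):
    """Illustrative split of attention into content and location terms.

    query:     (d,)   query vector
    keys:      (n, d) key vectors (content)
    positions: (n, p) position features for each key (e.g. one-hot)
    w_loc:     (p,)   location-scoring weights, independent of content
    lam:       mixing gate between the two attention distributions
    """
    content_scores = keys @ query          # content-based: query-key similarity
    location_scores = positions @ w_loc    # location-based: ignores key content
    content_attn = softmax(content_scores)
    location_attn = softmax(location_scores)
    # Each distribution sums to 1, so the convex mixture does too.
    return lam * content_attn + (1 - lam) * location_attn
```

Because the location term never touches the key vectors, a model can in principle learn positional patterns (e.g. "attend one step left") that transfer to sequence lengths never seen in training, which is the intuition the abstract appeals to.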

