Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies

11/04/2016
by Tal Linzen, et al.

The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities are often sensitive to syntactic structure; can such dependencies be captured by LSTMs, which do not have explicit structural representations? We begin addressing this question using number agreement in English subject-verb dependencies. We probe the architecture's grammatical competence both using training objectives with an explicit grammatical target (number prediction, grammaticality judgments) and using language models. In the strongly supervised settings, the LSTM achieved very high overall accuracy (less than 1% errors), but errors increased when sequential and structural information conflicted. The frequency of such errors rose sharply in the language-modeling setting. We conclude that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.
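
To make the number-prediction objective mentioned in the abstract concrete, the sketch below shows the shape of the task: an LSTM reads the words preceding a verb and classifies the upcoming verb as singular or plural. This is a minimal illustrative sketch assuming PyTorch; the class name, dimensions, and toy vocabulary are hypothetical and do not reproduce the authors' original implementation.

```python
# Illustrative sketch of the number-prediction task: an LSTM reads the
# prefix of a sentence up to (but not including) the verb and predicts
# whether that verb should be singular or plural. Hypothetical code,
# not the authors' implementation; sizes and vocabulary are toy values.
import torch
import torch.nn as nn

class NumberPredictor(nn.Module):
    def __init__(self, vocab_size, embed_dim=50, hidden_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)  # two classes: singular vs. plural

    def forward(self, token_ids):
        # token_ids: (batch, prefix_length) word indices preceding the verb
        embedded = self.embed(token_ids)
        _, (final_hidden, _) = self.lstm(embedded)
        # Classify from the hidden state at the verb position
        return self.out(final_hidden.squeeze(0))

# Hypothetical usage: "the keys to the cabinet ___" -> plural verb expected,
# even though the intervening noun "cabinet" is singular (an "attractor").
vocab = {"the": 0, "keys": 1, "to": 2, "cabinet": 3}
prefix = torch.tensor([[vocab["the"], vocab["keys"], vocab["to"],
                        vocab["the"], vocab["cabinet"]]])
model = NumberPredictor(vocab_size=len(vocab))
logits = model(prefix)                                    # shape: (1, 2)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1]))   # 1 = plural target
```

The other settings probed in the paper differ only in the training signal: grammaticality judgments classify whole sentences as acceptable or not, while the language-modeling setting predicts the next word and is evaluated on agreement only indirectly.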

Related research

How much complexity does an RNN architecture need to learn syntax-sensitive dependencies? (05/17/2020)
Long short-term memory (LSTM) networks and their variants are capable of...

Assessing the Unitary RNN as an End-to-End Compositional Model of Syntax (08/11/2022)
We show that both an LSTM and a unitary-evolution recurrent neural netwo...

Attribution Analysis of Grammatical Dependencies in LSTMs (04/30/2020)
LSTM language models have been shown to capture syntax-sensitive grammat...

Evaluating the Ability of LSTMs to Learn Context-Free Grammars (11/06/2018)
While long short-term memory (LSTM) neural net architectures are designe...

Structural Supervision Improves Learning of Non-Local Grammatical Dependencies (03/03/2019)
State-of-the-art LSTM language models trained on large corpora learn seq...

Predictive Representation Learning for Language Modeling (05/29/2021)
To effectively perform the task of next-word prediction, long short-term...

Discourse structure interacts with reference but not syntax in neural language models (10/10/2020)
Language models (LMs) trained on large quantities of text have been clai...
