Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks

01/10/2020
by R. Thomas McCoy, et al.

Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection. For both tasks, the training set is consistent with a generalization based on hierarchical structure and a generalization based on linear order. All architectural factors that we investigated qualitatively affected how models generalized, including factors with no clear connection to hierarchical structure. For example, LSTMs and GRUs displayed qualitatively different inductive biases. However, the only factor that consistently contributed a hierarchical bias across tasks was the use of a tree-structured model rather than a model with sequential recurrence, suggesting that human-like syntactic generalization requires architectural syntactic structure.
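For concreteness, here is a minimal sketch of how the two candidate generalizations diverge on the question formation task. This is not the paper's released code; the toy sentence, auxiliary list, and rule definitions are illustrative. A linear rule fronts the first auxiliary anywhere in the sentence, while a hierarchical rule fronts the auxiliary of the main clause.

# Illustrative sketch (assumptions noted above): contrasting a linear rule
# with a hierarchical rule for English question formation on a sentence
# whose subject contains a relative clause.

AUXILIARIES = {"can", "could", "do", "does", "will", "would"}

def linear_rule(words):
    """Front the FIRST auxiliary that appears anywhere in the sentence."""
    for i, w in enumerate(words):
        if w in AUXILIARIES:
            return [w] + words[:i] + words[i + 1:]
    return list(words)

def hierarchical_rule(words, main_aux_index):
    """Front the MAIN-CLAUSE auxiliary, identified here by a given index."""
    aux = words[main_aux_index]
    return [aux] + words[:main_aux_index] + words[main_aux_index + 1:]

sentence = "my walrus that can giggle does dance".split()
main_aux = sentence.index("does")  # auxiliary of the main clause

print("linear:      ", " ".join(linear_rule(sentence)))
# -> can my walrus that giggle does dance   (ungrammatical)
print("hierarchical:", " ".join(hierarchical_rule(sentence, main_aux)))
# -> does my walrus that can giggle dance   (the correct question)

On training-style sentences such as "my walrus does dance", where the first auxiliary is also the main-clause auxiliary, the two rules produce identical outputs; this is what leaves the training data ambiguous between a linear and a hierarchical generalization.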


