Pretraining on Non-linguistic Structure as a Tool for Analyzing Learning Bias in Language Models

04/30/2020
by Isabel Papadimitriou, et al.

We propose a novel methodology for analyzing the encoding of grammatical structure in neural language models through transfer learning. We test how a language model can leverage its internal representations to transfer knowledge across languages and symbol systems. We train LSTMs on non-linguistic, structured data and test their performance on human language to assess which kinds of data induce generalizable encodings that LSTMs can use for natural language. We find that models trained on structured data such as music and Java code have internal representations that help in modelling human language, and that, surprisingly, adding minimal amounts of structure to the training data makes a large difference in transfer to natural language. Further experiments on transfer between human languages show that zero-shot performance on a test language is highly correlated with syntactic similarity to the training language, even after removing any vocabulary overlap. This suggests that the internal representations induced from natural languages are typologically coherent: they encode the features and differences outlined in typological studies. Our results provide insights into how neural networks represent linguistic structure, and into the kinds of structural biases that give learners the ability to model language.
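To make the transfer setup concrete, here is a minimal sketch of the methodology (not the authors' released code), assuming PyTorch: an LSTM language model is pretrained on a structured, non-linguistic source corpus, then its recurrent weights are frozen and only fresh input and output embeddings are fitted on the target language before measuring perplexity, so that no vocabulary knowledge carries over. The corpus tensors below are random placeholders standing in for tokenized music, code, or natural-language data.

```python
import math
import torch
import torch.nn as nn

class LSTMLM(nn.Module):
    """Plain LSTM language model: embedding -> LSTM -> softmax over the vocabulary."""
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.lstm(self.embed(tokens))
        return self.out(hidden)

def train_lm(model, batches, params, epochs=1, lr=1e-3):
    """Train only `params` (all weights when pretraining, lexical layers when transferring)."""
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for seq in batches:  # seq: (batch, time) LongTensor of token ids
            logits = model(seq[:, :-1])
            loss = loss_fn(logits.reshape(-1, logits.size(-1)), seq[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()

@torch.no_grad()
def perplexity(model, batches):
    """Per-token perplexity of `model` on held-out batches."""
    model.eval()
    loss_fn = nn.CrossEntropyLoss(reduction="sum")
    total_loss, total_tokens = 0.0, 0
    for seq in batches:
        logits = model(seq[:, :-1])
        total_loss += loss_fn(logits.reshape(-1, logits.size(-1)), seq[:, 1:].reshape(-1)).item()
        total_tokens += seq[:, 1:].numel()
    return math.exp(total_loss / total_tokens)

# Placeholder corpora; a real experiment would use structured data (music, Java code)
# as the source and a tokenized natural-language corpus as the target.
source_batches = [torch.randint(0, 1000, (8, 32)) for _ in range(10)]
target_batches = [torch.randint(0, 1000, (8, 32)) for _ in range(10)]

model = LSTMLM(vocab_size=1000)

# 1) Pretrain every parameter on the non-linguistic source data.
train_lm(model, source_batches, model.parameters())

# 2) Transfer: freeze the recurrent weights, re-initialize the lexical layers so no
#    vocabulary overlap remains, and fit only the new embeddings on the target language.
for p in model.lstm.parameters():
    p.requires_grad = False
model.embed.reset_parameters()
model.out.reset_parameters()
train_lm(model, target_batches,
         list(model.embed.parameters()) + list(model.out.parameters()))

print("target-language perplexity:", perplexity(model, target_batches))
```

Comparing this target-language perplexity across different source corpora (random symbols, music, code, other human languages) is the kind of measurement the abstract describes for asking which structural properties of the pretraining data transfer to natural language.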


