Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention

04/26/2021
by Soo Hyun Ryu, et al.

We advance a novel explanation of similarity-based interference effects in subject-verb and reflexive pronoun agreement processing, grounded in surprisal values computed from a pretrained large-scale Transformer model, GPT-2. Specifically, we show that surprisal of the verb or reflexive pronoun predicts facilitatory interference effects in ungrammatical sentences, where a distractor noun that matches the verb or pronoun in number leads to faster reading times even though the distractor does not participate in the agreement relation. We review the human empirical evidence for such effects, including recent meta-analyses and large-scale studies. We also show that attention in the Transformer (indexed by entropy and other measures) becomes diffuse in the presence of similar distractors, consistent with cue-based retrieval models of parsing. But in contrast to these models, the attentional cues and memory representations are learned entirely from the simple self-supervised task of predicting the next word.
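
The two quantities named in the abstract, surprisal and attention entropy, can be illustrated with a minimal sketch. It assumes the standard definitions (surprisal as the negative log probability of a word given its context, and Shannon entropy over one attention distribution); the abstract does not spell out the exact measures used, and the function names and all numbers below are hypothetical toy values, not GPT-2 outputs or results from the paper.

```python
import math
from typing import Sequence


def surprisal(prob: float) -> float:
    """Surprisal in bits of a word whose conditional probability is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def attention_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy in bits of one attention distribution.

    Weights are renormalized so they sum to 1; zero weights contribute nothing.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical attention weights from a verb to three earlier tokens.
# "focused": most weight on one noun; "diffuse": weight spread across
# the subject and a similar distractor. These are toy values only.
focused = [0.90, 0.05, 0.05]
diffuse = [0.40, 0.35, 0.25]

if __name__ == "__main__":
    print("surprisal(p=0.25):", surprisal(0.25))
    print("entropy, focused:", attention_entropy(focused))
    print("entropy, diffuse:", attention_entropy(diffuse))
```

Under these definitions, a less probable verb or pronoun has higher surprisal, and attention spread across several similar nouns has higher entropy than attention concentrated on one of them.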

research
12/21/2022

Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal

Transformer-based large language models are trained to make predictions ...
research
04/12/2021

Multilingual Language Models Predict Human Reading Behavior

We analyze if large language models are able to predict patterns of huma...
research
05/19/2020

Comparing Transformers and RNNs on predicting human sentence processing data

Recurrent neural networks (RNNs) have long been an architecture of inter...
research
03/12/2017

Feature overwriting as a finite mixture process: Evidence from comprehension data

The ungrammatical sentence "The key to the cabinets are on the table" is...
research
12/08/2022

Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement

The long-distance agreement, evidence for syntactic structure, is increa...
research
05/23/2023

Assessing Linguistic Generalisation in Language Models: A Dataset for Brazilian Portuguese

Much recent effort has been devoted to creating large-scale language mod...
research
03/17/2016

Modeling self-organization of vocabularies under phonological similarity effects

This work develops a computational model (by Automata Networks) of phono...
