Comparing Transformers and RNNs on predicting human sentence processing data

05/19/2020
by Danny Merkx, et al.

Recurrent neural networks (RNNs) have long been an architecture of interest for computational models of human sentence processing. The more recently introduced Transformer architecture outperforms RNNs on many natural language processing tasks, but little is known about its ability to model human language processing. Because human sentence reading has long been thought to involve something akin to recurrence, RNNs may still hold an advantage over Transformers as cognitive models. In this paper we train both Transformer- and RNN-based language models and compare their performance as models of human sentence processing. We use the trained language models to compute surprisal values for the stimuli used in several reading experiments, and use linear mixed-effects modelling to measure how well surprisal explains measures of human reading effort. Our analysis shows that the Transformers outperform the RNNs as cognitive models in explaining self-paced reading times and N400 strength, but not gaze durations from an eye-tracking experiment.
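The surprisal measure used here is model-agnostic: for each word it is the negative log-probability the language model assigns to that word given its preceding context. As a minimal illustration of the quantity itself (not the paper's actual Transformer or RNN models), the sketch below computes word-by-word surprisal from a toy add-one-smoothed bigram model; the corpus, sentence, and function names are purely illustrative.

```python
import math
from collections import Counter

# Toy corpus; the paper instead derives probabilities from trained
# Transformer and RNN language models, but the surprisal formula is the same.
corpus = "the dog chased the cat the cat ran".split()

vocab = sorted(set(corpus))
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def bigram_prob(prev, word):
    """Add-one smoothed conditional probability P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def surprisal(prev, word):
    """Surprisal in bits: -log2 P(word | prev)."""
    return -math.log2(bigram_prob(prev, word))

# Per-word surprisal for a test sentence, conditioning on the previous word.
sentence = "the dog ran".split()
values = [surprisal(p, w) for p, w in zip(sentence, sentence[1:])]
```

In the paper's analysis, per-word surprisal values like these are entered as predictors of reading-effort measures (reading times, gaze durations, N400 amplitude) in linear mixed-effects models; an unseen continuation (here "dog ran") yields a higher surprisal than a frequently attested one, matching the intuition that less predictable words cost more processing effort.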
