Large Language Models Are Not Abstract Reasoners

05/31/2023
by   Gaël Gendron, et al.

Large Language Models (LLMs) have shown tremendous performance on a large variety of natural language processing tasks, ranging from text comprehension to common sense reasoning. However, the mechanisms responsible for this success remain unknown, and it is unclear whether LLMs can achieve human-like cognitive capabilities or whether these models are still fundamentally limited. Abstract reasoning is a fundamental task for cognition, consisting of finding and applying a general pattern from a small amount of data. Evaluating deep neural architectures on this task could give insight into their potential limitations regarding reasoning and their broad generalisation abilities, yet this is currently an under-explored area. In this paper, we perform extensive evaluations of state-of-the-art LLMs on abstract reasoning tasks, showing that they achieve very limited performance in contrast with other natural language tasks, and we investigate the reasons for this difference. We apply techniques that have been shown to improve performance on other NLP tasks and show that in most cases their impact on abstract reasoning performance is limited. In the course of this work, we have generated a new benchmark for evaluating language models on abstract reasoning tasks.

research
07/14/2022

Language models show human-like content effects on reasoning

Abstract reasoning is a key ability for an intelligent system. Large lan...
research
10/05/2021

Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning

Large natural language models (such as GPT-3 or T5) demonstrate impressi...
research
06/16/2023

No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference

Natural Language Inference (NLI) has been a cornerstone task in evaluati...
research
07/03/2023

Evaluating Shutdown Avoidance of Language Models in Textual Scenarios

Recently, there has been an increase in interest in evaluating large lan...
research
06/08/2023

covLLM: Large Language Models for COVID-19 Biomedical Literature

The COVID-19 pandemic led to 1.1 million deaths in the United States, de...
research
05/12/2022

Is the Computation of Abstract Sameness Relations Human-Like in Neural Language Models?

In recent years, deep neural language models have made strong progress i...
research
08/09/2022

Limitations of Language Models in Arithmetic and Symbolic Induction

Recent work has shown that large pretrained Language Models (LMs) can no...
