Limitations of Language Models in Arithmetic and Symbolic Induction

08/09/2022
by Jing Qian, et al.

Recent work has shown that large pretrained Language Models (LMs) not only perform remarkably well on a range of Natural Language Processing (NLP) tasks but also begin to improve on reasoning tasks such as arithmetic induction, symbolic manipulation, and commonsense reasoning as model size increases. However, it is still unclear what underlying capabilities these LMs actually have. Surprisingly, we find that these models have limitations on certain basic symbolic manipulation tasks such as copy, reverse, and addition. When the total number of symbols or repeating symbols increases, model performance drops quickly. We investigate the potential causes behind this phenomenon and examine a set of possible methods, including explicit positional markers, fine-grained computation steps, and LMs with callable programs. Experimental results show that none of these techniques can solve even the simplest addition induction problem completely. Finally, we introduce LMs with tutor, which demonstrates every single step of teaching. LMs with tutor is able to deliver 100% accuracy, pushing beyond the boundary of large LMs in induction.
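
To make these tasks concrete, below is a minimal Python sketch of the three symbolic manipulation tasks (copy, reverse, and addition) as input/target pairs, together with one hypothetical positional-marker encoding. The formats and the `with_positional_markers` scheme are illustrative assumptions, not the paper's actual prompt templates.

```python
# Minimal sketch of the three symbolic manipulation tasks (copy, reverse,
# addition) as input/target pairs, plus one hypothetical positional-marker
# encoding. Formats here are illustrative assumptions, not the paper's.

def copy_target(s: str) -> str:
    # Copy: the target is the input sequence unchanged.
    return s

def reverse_target(s: str) -> str:
    # Reverse: the target is the input sequence reversed.
    return s[::-1]

def addition_target(a: int, b: int) -> str:
    # Addition: the target is the decimal digit string of the sum.
    return str(a + b)

def with_positional_markers(s: str) -> str:
    # Prefix every symbol with its 1-based index so that repeated
    # symbols (the hard case identified above) become distinct tokens.
    return " ".join(f"{i}:{c}" for i, c in enumerate(s, start=1))

print(copy_target("aabba"))                # aabba
print(reverse_target("aabba"))             # abbaa
print(addition_target(1234, 8766))         # 10000
print(with_positional_markers("aabba"))    # 1:a 2:a 3:b 4:b 5:a
```

The marker variant hints at why repeating symbols are hard: without indices, the five characters of "aabba" collapse into only two distinct symbols for the model to keep track of.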

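The "fine-grained computation steps" baseline can likewise be illustrated with a scratchpad-style trace that spells out digit-by-digit addition with an explicit carry. This is a sketch under assumed step wording, not the paper's actual step format.

```python
# Hypothetical scratchpad-style trace for addition: one explicit step per
# digit position, right to left, with the carry written out. The step
# wording is an assumption; it only illustrates the general idea of
# exposing fine-grained computation steps to the model.

def addition_trace(a: int, b: int) -> list[str]:
    da, db = str(a)[::-1], str(b)[::-1]  # digits, least significant first
    steps, digits, carry = [], [], 0
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        carry_in = carry
        carry, d = divmod(x + y + carry_in, 10)
        digits.append(str(d))
        steps.append(f"pos {i}: {x} + {y} + carry {carry_in} -> digit {d}, carry {carry}")
    if carry:
        digits.append(str(carry))
        steps.append(f"leading carry -> digit {carry}")
    steps.append("answer: " + "".join(reversed(digits)))
    return steps

for line in addition_trace(1234, 8766):
    print(line)
```

Running this prints one line per digit position and ends with "answer: 10000"; the point is that each intermediate carry is made explicit rather than left for the model to infer.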

