Memory Augmented Large Language Models are Computationally Universal

01/10/2023
by Dale Schuurmans, et al.

We show that transformer-based large language models are computationally universal when augmented with an external memory. Any deterministic language model that conditions on strings of bounded length is equivalent to a finite automaton, hence computationally limited. However, augmenting such models with a read-write memory creates the possibility of processing arbitrarily large inputs and, potentially, simulating any algorithm. We establish that an existing large language model, Flan-U-PaLM 540B, can be combined with an associative read-write memory to exactly simulate the execution of a universal Turing machine, U_{15,2}. A key aspect of the finding is that it does not require any modification of the language model weights. Instead, the construction relies solely on designing a form of stored instruction computer that can subsequently be programmed with a specific set of prompts.
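
The core idea in the abstract, a bounded-input controller paired with an unbounded read-write memory, can be illustrated with a small Python sketch. This is not the paper's construction: the lookup table `TABLE` is a hypothetical stand-in for the prompted language model, the toy machine (binary increment, least significant bit first) is not U_{15,2}, and the names `run`, `TABLE`, and `BLANK` are assumptions made for illustration only.

```python
from collections import defaultdict

BLANK = "_"

# Hypothetical finite controller standing in for the language model.
# It only ever sees a bounded input: (state, symbol under the head).
# Output: (symbol to write, head movement, next state).
TABLE = {
    ("carry", "1"): ("0", +1, "carry"),   # 1 + carry = 0, keep carrying
    ("carry", "0"): ("1", +1, "halt"),    # 0 + carry = 1, done
    ("carry", BLANK): ("1", +1, "halt"),  # ran off the number: new digit
}


def run(tape_bits: str, max_steps: int = 10_000) -> str:
    """Increment a binary number written least significant bit first."""
    # Associative read-write memory: position -> symbol, unbounded in size.
    memory = defaultdict(lambda: BLANK)
    for position, symbol in enumerate(tape_bits):
        memory[position] = symbol

    state, head = "carry", 0
    for _ in range(max_steps):
        if state == "halt":
            break
        # The controller is consulted with a fixed-size query each step.
        write, move, state = TABLE[(state, memory[head])]
        memory[head] = write
        head += move

    cells = [memory[i] for i in range(min(memory), max(memory) + 1)]
    return "".join(cells).strip(BLANK)


if __name__ == "__main__":
    print(run("111"))  # 7 + 1 = 8, printed least significant bit first
```

The controller alone is a finite automaton, since its table has a fixed number of entries. All unbounded behavior comes from the external memory and the loop that feeds one bounded query at a time to the controller, which mirrors the division of labor the abstract describes.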

research
12/13/2016

Improving Neural Language Models with a Continuous Cache

We propose an extension to neural network language models to adapt their...
research
09/08/2021

Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories

Measuring event salience is essential in the understanding of stories. T...
research
02/15/2017

Frustratingly Short Attention Spans in Neural Language Modeling

Neural language models predict the next token using a latent representat...
research
05/23/2023

RET-LLM: Towards a General Read-Write Memory for Large Language Models

Large language models (LLMs) have significantly advanced the field of na...
research
03/04/2023

Could a Large Language Model be Conscious?

There has recently been widespread discussion of whether large language ...
research
02/28/2016

Lie Access Neural Turing Machine

Following the recent trend in explicit neural memory structures, we pres...
research
06/06/2023

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

We introduce Inference-Time Intervention (ITI), a technique designed to ...