Large Language Models as General Pattern Machines

07/10/2023
by Suvir Mirchandani, et al.

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences – from arbitrary sequences procedurally generated by probabilistic context-free grammars (PCFGs), to richer spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern-completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary. These results suggest that, without any additional training, LLMs can serve as general sequence modelers, driven by in-context learning. In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics – from extrapolating sequences of numbers that represent states over time to complete simple motions, to least-to-most prompting of reward-conditioned trajectories that can discover and represent closed-loop policies (e.g., a stabilizing controller for CartPole). While difficult to deploy today for real systems due to latency, context-size limitations, and compute costs, the approach of using LLMs to drive low-level control may provide an exciting glimpse into how the patterns among words could be transferred to actions.
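To make the sequence-extrapolation idea concrete, below is a minimal sketch (not the authors' implementation) of prompting an LLM to continue a serialized trajectory of states purely in-context. The `extrapolate` helper, the `llm_complete` callable, and the toy constant-velocity trajectory are assumptions for illustration; a real setup would route `llm_complete` to an actual text-completion endpoint.

```python
# Minimal sketch: treating an LLM as a zero-shot sequence extrapolator.
# `llm_complete` is a placeholder (an assumption for illustration) for
# whatever text-completion call is available; the rest simply builds the
# in-context prompt and parses the continuation.

from typing import Callable, List


def extrapolate(states: List[int],
                llm_complete: Callable[[str], str],
                n_future: int = 3) -> List[int]:
    """Serialize past states as comma-separated tokens, ask the LLM to
    continue the pattern, and parse the first few predicted numbers."""
    prompt = ", ".join(str(s) for s in states) + ","
    continuation = llm_complete(prompt)
    predicted = [int(tok) for tok in continuation.replace(",", " ").split()
                 if tok.lstrip("-").isdigit()]
    return predicted[:n_future]


if __name__ == "__main__":
    # Toy example: a 1-D motion whose position increases by 2 each step.
    # With a real LLM behind `llm_complete`, the expected continuation of
    # "0, 2, 4, 6, 8, 10," would be "12, 14, 16, ...".
    def fake_llm(prompt: str) -> str:
        return " 12, 14, 16"  # stand-in for a real model call

    print(extrapolate([0, 2, 4, 6, 8, 10], fake_llm))  # -> [12, 14, 16]
```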


