Generalized Planning in PDDL Domains with Pretrained Large Language Models

05/18/2023
by Tom Silver, et al.

Recent work has considered whether large language models (LLMs) can function as planners: given a task, generate a plan. We investigate whether LLMs can serve as generalized planners: given a domain and training tasks, generate a program that efficiently produces plans for other tasks in the domain. In particular, we consider PDDL domains and use GPT-4 to synthesize Python programs. We also consider (1) Chain-of-Thought (CoT) summarization, where the LLM is prompted to summarize the domain and propose a strategy in words before synthesizing the program; and (2) automated debugging, where the program is validated with respect to the training tasks, and in case of errors, the LLM is re-prompted with four types of feedback. We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. Overall, we find that GPT-4 is a surprisingly powerful generalized planner. We also conclude that automated debugging is very important, that CoT summarization has non-uniform impact, that GPT-4 is far superior to GPT-3.5, and that just two training tasks are often sufficient for strong generalization.
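The pipeline described above (optional CoT summarization, program synthesis, validation on the training tasks, and re-prompting with feedback) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `llm` callable, the `validate` callback, the `get_plan` entry-point name, the prompt wording, and the `max_debug_rounds` parameter are all hypothetical. The abstract mentions four types of feedback without naming them, so the sketch collapses feedback into two generic cases (a raised exception, or a plan rejected by the validator).

```python
import traceback
from typing import Any, Callable, List, Optional, Tuple

Task = Any          # hypothetical: whatever object encodes one PDDL task
Plan = List[str]    # hypothetical: a plan as a list of action strings


def first_failure(code: str, tasks: List[Task],
                  validate: Callable[[Plan, Task], Optional[str]]) -> Optional[str]:
    """Run the candidate program on each training task; return feedback text or None."""
    try:
        namespace: dict = {}
        # NOTE: exec on model-generated code is unsafe outside a sandbox.
        exec(code, namespace)
        get_plan = namespace["get_plan"]  # assumed entry-point name
    except Exception:
        return "Program failed to load:\n" + traceback.format_exc(limit=1)
    for i, task in enumerate(tasks):
        try:
            plan = get_plan(task)
        except Exception:
            return f"Task {i} raised an exception:\n" + traceback.format_exc(limit=1)
        error = validate(plan, task)  # None means the plan is valid
        if error is not None:
            return f"Task {i} produced an invalid plan: {error}"
    return None


def synthesize_generalized_planner(
    llm: Callable[[str], str],
    domain: str,
    train_tasks: List[Task],
    validate: Callable[[Plan, Task], Optional[str]],
    max_debug_rounds: int = 4,
    use_cot: bool = True,
) -> Tuple[str, bool]:
    """Return (program source, whether it passed all training tasks)."""
    prompt = f"Domain:\n{domain}\n"
    if use_cot:
        # CoT summarization: ask for a strategy in words before any code.
        summary = llm(prompt + "Summarize the domain and propose a strategy in words.")
        prompt += f"Strategy:\n{summary}\n"
    code = llm(prompt + "Write a Python function get_plan(task) implementing it.")
    for round_index in range(max_debug_rounds + 1):
        feedback = first_failure(code, train_tasks, validate)
        if feedback is None:
            return code, True
        if round_index == max_debug_rounds:
            break  # debugging budget exhausted
        # Automated debugging: re-prompt with the failing program and feedback.
        code = llm(prompt + f"Previous program:\n{code}\n"
                            f"Feedback:\n{feedback}\nFix the program.")
    return code, False
```

In this sketch the synthesized program, not the LLM, produces plans for new tasks, so the model is queried only during synthesis and debugging. Passing `use_cot=False` gives a rough analogue of a no-summarization ablation.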


