On the Planning Abilities of Large Language Models – A Critical Investigation

05/25/2023
by Karthik Valmeekam, et al.

Intrigued by claims of emergent reasoning capabilities in LLMs trained on general web corpora, we set out in this paper to investigate their planning capabilities. We aim to evaluate (1) the effectiveness of LLMs in generating plans autonomously in commonsense planning tasks and (2) the potential of LLMs as a source of heuristic guidance for other agents (AI planners) in their planning tasks. We conduct a systematic study by generating a suite of instances on domains similar to the ones employed in the International Planning Competition, and we evaluate LLMs in two distinct modes: autonomous and heuristic. Our findings reveal that LLMs' ability to generate executable plans autonomously is rather limited, with the best model (GPT-4) having an average success rate of ~12% across the domains. However, the results in the heuristic mode show more promise. In the heuristic mode, we demonstrate that LLM-generated plans can improve the search process for underlying sound planners, and we additionally show that external verifiers can provide feedback on the generated plans and back-prompt the LLM for better plan generation.
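The verifier-and-back-prompt idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the STRIPS-style action encoding, the function names `verify_plan` and `backprompt_loop`, the two-block toy domain, and the `stub_llm` generator (a stand-in for a real LLM call) are all assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

# Hypothetical STRIPS-style action: (preconditions, add effects, delete effects).
Action = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]


def verify_plan(init: FrozenSet[str], goal: FrozenSet[str],
                actions: Dict[str, Action], plan: List[str]) -> Optional[str]:
    """Simulate the plan; return None if it is valid, else an error message."""
    state = set(init)
    for step, name in enumerate(plan, start=1):
        if name not in actions:
            return f"step {step}: unknown action '{name}'"
        pre, add, delete = actions[name]
        missing = pre - state
        if missing:
            return f"step {step}: '{name}' needs {sorted(missing)}"
        state = (state - delete) | add
    unmet = goal - state
    if unmet:
        return f"goal not reached: missing {sorted(unmet)}"
    return None


def backprompt_loop(generate: Callable[[Optional[str]], List[str]],
                    init: FrozenSet[str], goal: FrozenSet[str],
                    actions: Dict[str, Action],
                    max_rounds: int = 3) -> Optional[List[str]]:
    """Ask the generator for a plan, feeding verifier errors back each round."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        plan = generate(feedback)
        feedback = verify_plan(init, goal, actions, plan)
        if feedback is None:
            return plan  # the verifier accepted the plan
    return None  # no valid plan within the round budget


# Toy two-block domain (assumed for illustration only).
ACTIONS: Dict[str, Action] = {
    "unstack a b": (frozenset({"on a b", "clear a", "handempty"}),
                    frozenset({"holding a", "clear b"}),
                    frozenset({"on a b", "clear a", "handempty"})),
    "putdown a": (frozenset({"holding a"}),
                  frozenset({"ontable a", "clear a", "handempty"}),
                  frozenset({"holding a"})),
}
INIT = frozenset({"on a b", "ontable b", "clear a", "handempty"})
GOAL = frozenset({"ontable a"})


def stub_llm(feedback: Optional[str]) -> List[str]:
    """Stand-in for an LLM: a flawed first attempt, then a corrected plan."""
    if feedback is None:
        return ["putdown a"]
    return ["unstack a b", "putdown a"]


if __name__ == "__main__":
    print(backprompt_loop(stub_llm, INIT, GOAL, ACTIONS))
```

In this sketch the first candidate plan fails because the precondition `holding a` does not hold, the verifier's message is passed back to the generator, and the second candidate is accepted. A real setup would replace `stub_llm` with a prompted model and the toy verifier with a sound plan validator.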

research
08/24/2023

SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge

Large Language Models (LLMs) have demonstrated impressive planning abili...
research
05/26/2023

AdaPlanner: Adaptive Planning from Feedback with Language Models

Large language models (LLMs) have recently demonstrated the potential in...
research
05/26/2023

Learning and Leveraging Verifiers to Improve Planning Capabilities of Pre-trained Language Models

There have been widespread claims in the literature about the emergent ...
research
12/16/2022

Plansformer: Generating Symbolic Plans using Transformers

Large Language Models (LLMs) have been the subject of active research, s...
research
02/19/2021

SQAPlanner: Generating Data-Informed Software Quality Improvement Plans

Software Quality Assurance (SQA) planning aims to define proactive plans...
research
06/30/2011

Taming Numbers and Durations in the Model Checking Integrated Planning System

The Model Checking Integrated Planning System (MIPS) is a temporal least...
research
08/27/2014

Knowledge Engineering for Planning-Based Hypothesis Generation

In this paper, we address the knowledge engineering problems for hypothe...
