Planning with Large Language Models via Corrective Re-prompting

by Shreyas Sundara Raman, et al.

Extracting the commonsense knowledge present in Large Language Models (LLMs) offers a path to designing intelligent, embodied agents. Related works have queried LLMs with a wide range of contextual information, such as goals, sensor observations, and scene descriptions, to generate high-level action plans for specific tasks; however, these approaches often require human intervention or additional machinery to enable sensorimotor interactions. In this work, we propose a prompting-based strategy for extracting executable plans from an LLM that leverages a novel and readily accessible source of information: precondition errors. Our approach assumes that actions are only afforded execution in certain contexts, i.e., implicit preconditions must be met for an action to execute (e.g., a door must be unlocked before it can be opened), and that the embodied agent can determine whether an action is executable in the current context (e.g., by detecting a precondition error). When the agent is unable to execute an action, our approach re-prompts the LLM with the precondition error information to extract an executable corrective action that achieves the intended goal in the current context. We evaluate our approach in the VirtualHome simulation environment on 88 different tasks and 7 scenes. We evaluate different prompt templates and compare against methods that naively re-sample actions from the LLM. Our approach, using precondition errors, improves the executability and semantic correctness of plans while also reducing the number of re-prompts required when querying for actions.
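The corrective re-prompting loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `fake_llm`, `ToyEnv`, and the prompt strings are all hypothetical stand-ins for the real LLM, the VirtualHome interface, and the paper's prompt templates.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM (hypothetical): proposes a corrective action when
    the prompt mentions a precondition error, otherwise the naive action."""
    if "is locked" in prompt:
        return "unlock door"
    return "open door"


class ToyEnv:
    """Toy environment where 'open door' carries the implicit precondition
    that the door is unlocked (mirroring the door example in the abstract)."""

    def __init__(self):
        self.locked = True
        self.open = False

    def execute(self, action: str):
        """Attempt an action; return (success, error_message)."""
        if action == "unlock door":
            self.locked = False
            return True, ""
        if action == "open door":
            if self.locked:
                return False, "precondition error: the door is locked"
            self.open = True
            return True, ""
        return False, f"unknown action: {action}"


def corrective_replan(goal: str, env: ToyEnv, max_reprompts: int = 5):
    """Query the LLM for actions one at a time; when an action fails its
    precondition check, re-prompt with the error text appended so the model
    can propose an executable corrective action."""
    prompt = f"Goal: {goal}\nNext action:"
    executed = []
    for _ in range(max_reprompts):
        action = fake_llm(prompt)
        ok, err = env.execute(action)
        if ok:
            executed.append(action)
            if env.open:  # goal reached in this toy setting
                return executed
            prompt = f"Goal: {goal}\nDone: {executed}\nNext action:"
        else:
            # Corrective re-prompt: feed the precondition error back to the LLM.
            prompt = f"Goal: {goal}\nAction '{action}' failed: {err}\nNext action:"
    return executed
```

In this toy run, the naive "open door" fails its precondition check, the error is fed back, and the model recovers by unlocking first, so the executed plan becomes `["unlock door", "open door"]` with one corrective re-prompt instead of blind re-sampling.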




Related research

- Generating Executable Action Plans with Environmentally-Aware Language Models
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
- You Only Look at Screens: Multimodal Chain-of-Action Agents
- Every Action Based Sensor
- Extracting Action Sequences from Texts Based on Deep Reinforcement Learning
- Grounding Classical Task Planners via Vision-Language Models
- Discover Life Skills for Planning with Bandits via Observing and Learning How the World Works
