Task and Motion Planning with Large Language Models for Object Rearrangement

03/10/2023
by Yan Ding, et al.

Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning is frequently needed in this process. However, achieving commonsense arrangements requires knowledge about objects, which is hard to transfer to robots. Large language models (LLMs) are one potential source of this knowledge, but they do not natively capture information about plausible physical arrangements of the world. We propose LLM-GROP, which uses prompting to extract commonsense knowledge about semantically valid object configurations from an LLM and instantiates them with a task and motion planner in order to generalize to varying scene geometry. LLM-GROP allows us to go from natural-language commands to human-aligned object rearrangement in varied environments. In human evaluations, our approach achieves the highest ratings and outperforms competitive baselines in success rate while maintaining comparable cumulative action costs. Finally, we demonstrate a practical implementation of LLM-GROP on a mobile manipulator in real-world scenarios. Supplementary materials are available at: https://sites.google.com/view/llm-grop
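To make the pipeline concrete, here is a minimal Python sketch of the grounding step the abstract describes: an LLM proposes a symbolic arrangement (spatial relations between objects), and those relations are instantiated as concrete poses on a table. This is an illustration only, not the authors' implementation; the layout, relation names, and offsets below are all hypothetical.

```python
# Hypothetical symbolic layout an LLM prompt might return for
# "set the table": each object paired with a spatial relation and
# an anchor object (None for the anchor-free object).
symbolic_layout = {
    "plate": ("center", None),
    "fork":  ("left_of", "plate"),
    "knife": ("right_of", "plate"),
}

# Illustrative offsets (meters) realizing each spatial relation.
RELATION_OFFSETS = {
    "center":   (0.0, 0.0),
    "left_of":  (-0.15, 0.0),
    "right_of": (0.15, 0.0),
}

def ground_layout(layout, table_center=(0.5, 0.0)):
    """Turn symbolic relations into 2D poses on the table surface."""
    poses = {}
    # First place objects with no anchor (absolute placements).
    for obj, (rel, anchor) in layout.items():
        if anchor is None:
            dx, dy = RELATION_OFFSETS[rel]
            poses[obj] = (table_center[0] + dx, table_center[1] + dy)
    # Then place objects defined relative to an already-placed anchor.
    for obj, (rel, anchor) in layout.items():
        if anchor is not None:
            ax, ay = poses[anchor]
            dx, dy = RELATION_OFFSETS[rel]
            poses[obj] = (ax + dx, ay + dy)
    return poses

poses = ground_layout(symbolic_layout)
```

In the full system, a task and motion planner would then check each grounded pose for geometric feasibility (reachability, collisions) and select a low-cost feasible placement; this sketch only covers the symbolic-to-geometric translation.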


Related research

- Integrating Action Knowledge and LLMs for Task Planning and Situation Handling in Open Worlds (05/27/2023)
  Task planning systems have been developed to help robots use human knowl...

- Large Language Models as Commonsense Knowledge for Large-Scale Task Planning (05/23/2023)
  Natural language provides a natural interface for human communication, y...

- Neuro-Symbolic Causal Language Planning with Commonsense Prompting (06/06/2022)
  Language planning aims to implement complex high-level goals by decompos...

- Robot Task Planning and Situation Handling in Open Worlds (10/04/2022)
  Automated task planning algorithms have been developed to help robots co...

- Reconstructing Action-Conditioned Human-Object Interactions Using Commonsense Knowledge Priors (09/06/2022)
  We present a method for inferring diverse 3D models of human-object inte...

- Learning the Effects of Physical Actions in a Multi-modal Environment (01/27/2023)
  Large Language Models (LLMs) handle physical commonsense information ina...

- Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill (09/19/2023)
  Zero-shot object navigation is a challenging task for home-assistance ro...
