Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

04/04/2022
by Michael Ahn, et al.

Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language. However, a significant weakness of language models is that they lack real-world experience, which makes it difficult to leverage them for decision making within a given embodiment. For example, asking a language model to describe how to clean a spill might result in a reasonable narrative, but it may not be applicable to a particular agent, such as a robot, that needs to perform this task in a particular environment. We propose to provide real-world grounding by means of pretrained skills, which are used to constrain the model to propose natural language actions that are both feasible and contextually appropriate. The robot can act as the language model's "hands and eyes," while the language model supplies high-level semantic knowledge about the task. We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment. We evaluate our method on a number of real-world robotic tasks, where we show the need for real-world grounding and that this approach is capable of completing long-horizon, abstract, natural language instructions on a mobile manipulator. The project's website and the video can be found at https://say-can.github.io/
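The core idea — scoring each candidate skill by combining the language model's likelihood of its description with a value function's estimate that the skill can currently succeed — can be sketched as follows. This is a minimal illustration with hypothetical skill names and made-up scores, not the paper's actual models:

```python
# SayCan-style action selection (illustrative sketch).
# llm_scores: how relevant the language model thinks each skill
#   description is to the instruction (hypothetical numbers).
# affordance: the value function's estimate that the skill can
#   succeed in the current state (hypothetical numbers).
llm_scores = {
    "find a sponge": 0.5,
    "go to the table": 0.3,
    "pick up the apple": 0.2,
}
affordance = {
    "find a sponge": 0.9,
    "go to the table": 0.8,
    "pick up the apple": 0.1,  # no apple visible, so low value
}

# Combined score: the skill must be both useful (per the LLM)
# and feasible (per the value function).
combined = {s: llm_scores[s] * affordance[s] for s in llm_scores}
best = max(combined, key=combined.get)
print(best)  # the skill that is both relevant and currently feasible
```

In the full system this selection is repeated step by step: the chosen skill is executed, the state changes, and the next skill is scored against the updated scene, which is how long-horizon instructions are decomposed into feasible steps.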


