Translating Natural Language Instructions to Computer Programs for Robot Manipulation

by Sagar Gubbi Venkatesh, et al.
Indian Institute of Science

It is highly desirable for robots that work alongside humans to understand instructions given in natural language. Existing language-conditioned imitation learning methods predict actuator commands from the image observation and the instruction text. Rather than predicting actuator commands directly, we propose translating the natural language instruction into a Python function that, when executed, queries the scene by accessing the output of the object detector and controls the robot to perform the specified task. This enables the use of non-differentiable modules, such as a constraint solver, when computing commands for the robot. Moreover, the labels in this setup are computer programs, which are significantly more descriptive than teleoperated demonstrations. We show that the proposed method performs better than training a neural network to directly predict the robot actions.
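
To make the idea concrete, here is a minimal sketch of what a translated program might look like for the instruction "put the red block to the left of the blue bowl". It is not the authors' implementation: the `Detection`, `Scene`, and `Robot` classes, the `find`, `pick`, and `place` methods, the `generated_program` function, and the 10 cm offset are all hypothetical stand-ins, chosen only to show a function that reads the object detector's output and then issues robot commands.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """One detected object: a label and a position in centimetres (assumed units)."""
    label: str
    x: float
    y: float


@dataclass
class Scene:
    """Hypothetical wrapper around the object detector's output."""
    detections: list

    def find(self, label):
        # Return the first detection with a matching label.
        for det in self.detections:
            if det.label == label:
                return det
        raise LookupError(f"no object labelled {label!r}")


@dataclass
class Robot:
    """Hypothetical robot interface that only records the commands it receives."""
    log: list = field(default_factory=list)

    def pick(self, x, y):
        self.log.append(("pick", x, y))

    def place(self, x, y):
        self.log.append(("place", x, y))


LEFT_OFFSET_CM = 10.0  # assumed meaning of "to the left of"


def generated_program(scene, robot):
    """Illustrative output for: 'put the red block to the left of the blue bowl'."""
    block = scene.find("red block")
    bowl = scene.find("blue bowl")
    robot.pick(block.x, block.y)
    # The target is computed by ordinary code, which need not be differentiable.
    robot.place(bowl.x - LEFT_OFFSET_CM, bowl.y)


scene = Scene([Detection("red block", 20.0, 30.0), Detection("blue bowl", 50.0, 30.0)])
robot = Robot()
generated_program(scene, robot)
print(robot.log)
```

Because the target position is computed by plain Python, the subtraction could in principle be replaced by a call to a constraint solver without changing how the program is produced from the instruction.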

