OpenD: A Benchmark for Language-Driven Door and Drawer Opening

12/10/2022
by Yizhou Zhao, et al.

We introduce OpenD, a benchmark for learning to use a hand to open cabinet doors or drawers, driven by language instructions, in a photo-realistic and physics-reliable simulation environment. To solve the task, we propose a multi-step planner composed of a deep neural network and rule-based controllers. The network captures spatial relationships from images and interprets the semantic meaning of the language instructions. The controllers then execute the plan efficiently based on this spatial and semantic understanding. We evaluate our system by measuring its zero-shot performance on the test data set. Experimental results demonstrate the effectiveness of our multi-step planner's decision planning across different hands, while suggesting that there is significant room for better models to address the challenges of language understanding, spatial reasoning, and long-term manipulation. We will release OpenD and host challenges to promote future research in this area.
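The split between a learned perception stage and rule-based execution can be illustrated with a minimal Python sketch. This is not the authors' implementation: `Target`, `predict_target`, `plan_steps`, the step names, and the `standoff` and `pull_distance` values are all hypothetical, and the network is replaced by a stub that returns a fixed handle pose.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Target:
    """Hypothetical perception output: where the handle is and which way it opens."""
    handle_position: Vec3
    pull_direction: Vec3  # unit vector along which the door or drawer moves


def predict_target(image, instruction: str) -> Target:
    """Stub standing in for the neural network (assumption, not the paper's model).

    A real model would ground the instruction in the image; this stub ignores
    both inputs and returns a fixed target so the sketch stays runnable.
    """
    return Target(handle_position=(1.0, 0.0, 0.5), pull_direction=(-1.0, 0.0, 0.0))


def offset(point: Vec3, direction: Vec3, distance: float) -> Vec3:
    """Move `point` by `distance` along `direction`."""
    return tuple(p + d * distance for p, d in zip(point, direction))


def plan_steps(target: Target, standoff: float = 0.10,
               pull_distance: float = 0.30) -> List[Tuple[str, Vec3]]:
    """Rule-based multi-step plan: approach, reach, grasp, then pull."""
    pre_grasp = offset(target.handle_position, target.pull_direction, standoff)
    pulled = offset(target.handle_position, target.pull_direction, pull_distance)
    return [
        ("approach", pre_grasp),             # stop short of the handle
        ("reach", target.handle_position),   # move onto the handle
        ("grasp", target.handle_position),   # close the hand in place
        ("pull", pulled),                    # drag along the opening direction
    ]


if __name__ == "__main__":
    target = predict_target(image=None, instruction="open the top drawer")
    for name, waypoint in plan_steps(target):
        print(name, waypoint)
```

Keeping the plan as a list of named waypoints means the same rules could, in principle, drive different hands, with only the low-level motion for each step changing.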


