Task-Oriented Grasp Prediction with Visual-Language Inputs

02/28/2023
by Chao Tang, et al.

To perform household tasks, assistive robots receive tool-manipulation commands in the form of natural language instructions from users. The first stage is to select the intended tool (i.e., object grounding) and grasp it in a task-oriented manner (i.e., task grounding). However, prior work on visual-language grasping (VLG) has focused on object grounding while disregarding the fine-grained effect of the task on object grasping. Grasping a tool in a task-incompatible way will inevitably limit the success of subsequent manipulation steps. Motivated by this problem, this paper proposes GraspCLIP, which addresses task grounding in addition to object grounding to enable task-oriented grasp prediction with visual-language inputs. Evaluation on a custom dataset demonstrates that GraspCLIP outperforms established baselines that perform object grounding only. The effectiveness of the proposed method is further validated on an assistive robotic arm platform for grasping previously unseen kitchen tools given a task specification. Our presentation video is available at: https://www.youtube.com/watch?v=e1wfYQPeAXU.
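As an illustration only, and not the GraspCLIP architecture described in the paper, the sketch below shows one way visual-language embeddings could be used to rank grasp candidates against a task instruction. The pretrained CLIP checkpoint, the score_grasp_crops helper, and the assumption that an upstream sampler supplies grasp-centered image crops are all hypothetical choices made for this example.

# Hypothetical sketch: ranking grasp candidates with CLIP-style
# visual-language similarity. This is not the method proposed in the paper;
# it only illustrates fusing image and instruction features for task grounding.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def score_grasp_crops(crops: list[Image.Image], instruction: str) -> torch.Tensor:
    """Return one cosine-similarity score per grasp-centered image crop.

    `crops` are image patches centered on candidate grasp poses (assumed to
    come from an upstream grasp sampler); the highest-scoring crop is taken
    as the grasp most compatible with the task instruction.
    """
    inputs = processor(text=[instruction], images=crops,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        image_feats = model.get_image_features(pixel_values=inputs["pixel_values"])
        text_feats = model.get_text_features(input_ids=inputs["input_ids"],
                                             attention_mask=inputs["attention_mask"])
    # Normalize and compute cosine similarity between each crop and the instruction.
    image_feats = image_feats / image_feats.norm(dim=-1, keepdim=True)
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
    return (image_feats @ text_feats.T).squeeze(-1)

In this toy setup, a caller would pass crops around sampled grasp points together with an instruction such as "hand me the knife to cut bread" and pick the argmax score; a task-oriented system like GraspCLIP instead learns task grounding end to end rather than relying on off-the-shelf similarity alone.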


Related research

02/24/2023 · A Joint Modeling of Vision-Language-Action for Target-oriented Grasping in Clutter
We focus on the task of language-conditioned grasping in clutter, in whi...

01/27/2023 · Learning 6-DoF Fine-grained Grasp Detection Based on Part Affordance Grounding
Robotic grasping is a fundamental ability for a robot to interact with t...

06/25/2018 · Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision
Tool manipulation is vital for facilitating robots to complete challengi...

07/25/2023 · GraspGPT: Leveraging Semantic Knowledge from a Large Language Model for Task-Oriented Grasping
Task-oriented grasping (TOG) refers to the problem of predicting grasps ...

10/27/2019 · Task-Oriented Language Grounding for Language Input with Multiple Sub-Goals of Non-Linear Order
In this work, we analyze the performance of general deep reinforcement l...

05/25/2019 · Reasoning on Grasp-Action Affordances
Artificial intelligence is essential to succeed in challenging activitie...

09/22/2021 · Audio-Visual Grounding Referring Expression for Robotic Manipulation
Referring expressions are commonly used when referring to a specific tar...
