PROGrasp: Pragmatic Human-Robot Communication for Object Grasping

09/14/2023
by Gi-Cheon Kang, et al.

Interactive Object Grasping (IOG) is the task of identifying and grasping a desired object via human-robot natural language interaction. Current IOG systems assume that the human user initially specifies the target object's category (e.g., bottle). Inspired by pragmatics, where humans often rely on context to convey their intentions and achieve goals, we introduce a new IOG task, Pragmatic-IOG, and a corresponding dataset, Intention-oriented Multi-modal Dialogue (IM-Dial). In our proposed task scenario, the robot is initially given an intention-oriented utterance (e.g., "I am thirsty") and must identify the target object by interacting with the human user. Based on this task setup, we propose Pragmatic Object Grasping (PROGrasp), a new robotic system that can interpret the user's intention and pick up the target object. PROGrasp performs Pragmatic-IOG by incorporating modules for visual grounding, question asking, object grasping, and, most importantly, answer interpretation for pragmatic inference. Experimental results show that PROGrasp is effective in both offline (i.e., target object discovery) and online (i.e., IOG with a physical robot arm) settings.
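The abstract describes a loop of visual grounding, question asking, answer interpretation, and grasping. The following is a minimal, self-contained sketch of such a Pragmatic-IOG loop; all function names, the tag-based scene representation, and the keyword rules are illustrative assumptions, not PROGrasp's actual models or API.

```python
def pragmatic_iog(intention, scene, answer_fn, max_turns=3):
    """Toy Pragmatic-IOG loop (illustrative only, not PROGrasp's method).

    intention  -- intention-oriented utterance, e.g. "I am thirsty"
    scene      -- list of objects: {"name": str, "tags": set of attributes}
    answer_fn  -- stands in for the human user: maps a question to an answer
    """
    # Visual grounding (toy rule): map the intention to plausible objects.
    if "thirsty" in intention:
        candidates = [o for o in scene if "drinkable" in o["tags"]]
    else:
        candidates = list(scene)

    for _ in range(max_turns):
        if len(candidates) <= 1:
            break
        # Question asking: disambiguate among the remaining candidates.
        question = "Which one: " + " or ".join(o["name"] for o in candidates) + "?"
        answer = answer_fn(question)
        # Answer interpretation (pragmatic inference): the answer may name an
        # attribute ("the cold one") rather than the object itself.
        candidates = [o for o in candidates
                      if answer in o["tags"] or answer == o["name"]]

    # Object grasping would act on the resolved target here.
    return candidates[0]["name"] if candidates else None


scene = [
    {"name": "bottle", "tags": {"drinkable", "cold"}},
    {"name": "mug",    "tags": {"drinkable", "hot"}},
    {"name": "remote", "tags": {"plastic"}},
]
target = pragmatic_iog("I am thirsty", scene, lambda q: "cold")  # resolves to "bottle"
```

Here the user's answer ("cold") never names the bottle directly; the interpretation step infers the referent from context, which is the pragmatic inference the task emphasizes.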


Related research

research 03/17/2021 — Few-Shot Visual Grounding for Natural Human-Robot Interaction
Natural Human-Robot Interaction (HRI) is one of the key components for s...

research 05/28/2018 — Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
In this paper, we propose the Interactive Text2Pickup (IT2P) network for...

research 05/24/2005 — Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks
A major challenge for the realization of intelligent robots is to supply...

research 05/17/2019 — When the goal is to generate a series of activities: A self-organized simulated robot arm
Behavior is characterized by sequences of goal-oriented conducts, such a...

research 08/30/2023 — WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model
Enabling robots to understand language instructions and react accordingl...

research 10/05/2019 — Early Estimation of User's Intention of Tele-Operation Using Object Affordance and Hand Motion in a Dual First-Person Vision
This paper describes a method of estimating the intention of a user's mo...

research 03/15/2022 — Interactive Robotic Grasping with Attribute-Guided Disambiguation
Interactive robotic grasping using natural language is one of the most f...
