Interactive Robotic Grasping with Attribute-Guided Disambiguation

03/15/2022
by Yang Yang, et al.

Interactive robotic grasping using natural language is one of the most fundamental tasks in human-robot interaction. However, language can be a source of ambiguity, particularly when the visual scene or the command itself is ambiguous. This paper investigates the use of object attributes in disambiguation and develops an interactive grasping system that resolves ambiguities through dialogue. Our approach first predicts target scores and attribute scores through vision-and-language grounding. To handle ambiguous objects and commands, we propose an attribute-guided formulation of the partially observable Markov decision process (Attr-POMDP) for disambiguation. The Attr-POMDP uses the target and attribute scores as its observation model to calculate the expected return of asking an attribute-based question (e.g., "what is the color of the target, red or green?") or a pointing-based question (e.g., "do you mean this one?"). Our disambiguation module runs in real time on a real robot, and the interactive grasping system achieves 91.43% selection accuracy in real-robot experiments, outperforming several baselines by large margins.
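
To make the question-selection idea concrete, here is a minimal sketch in Python. It is not the paper's Attr-POMDP: it swaps the POMDP's expected return for a greedy one-step criterion (expected entropy of the belief over candidate objects after the answer), and it ignores question costs and grasp rewards. All function names, the 0.05 answer-noise value, and the example scores are assumptions made for illustration only.

```python
from math import log2


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive total")
    return [w / total for w in weights]


def entropy(belief):
    """Shannon entropy (bits) of a belief over candidate objects."""
    return -sum(p * log2(p) for p in belief if p > 0)


def update(belief, likelihoods):
    """Bayes update: posterior is proportional to prior * P(answer | object is target)."""
    return normalize([b * l for b, l in zip(belief, likelihoods)])


def expected_entropy_after(belief, answer_likelihoods):
    """Average posterior entropy over all possible answers to one question.

    answer_likelihoods maps each answer to a list of
    P(answer | object i is the target).
    """
    expected = 0.0
    for likelihoods in answer_likelihoods.values():
        p_answer = sum(b * l for b, l in zip(belief, likelihoods))
        if p_answer > 0:
            expected += p_answer * entropy(update(belief, likelihoods))
    return expected


def pointing_question(index, n_objects, noise=0.05):
    """'Do you mean this one?' about object `index` (noise is an assumed error rate)."""
    yes = [1 - noise if i == index else noise for i in range(n_objects)]
    return {"yes": yes, "no": [1 - p for p in yes]}


def attribute_question(attr_scores):
    """'What is the color of the target?' built from per-object attribute scores.

    attr_scores maps each attribute value to per-object scores; scores are
    renormalized per object so the answers form a distribution.
    """
    n_objects = len(next(iter(attr_scores.values())))
    totals = [sum(s[i] for s in attr_scores.values()) for i in range(n_objects)]
    return {v: [s[i] / totals[i] for i in range(n_objects)]
            for v, s in attr_scores.items()}


def best_question(belief, questions):
    """Pick the question whose answer is expected to leave the least uncertainty."""
    return min(questions, key=lambda q: expected_entropy_after(belief, questions[q]))


if __name__ == "__main__":
    # Hypothetical scene: six candidates, the first three red, the rest green.
    color_scores = {"red": [0.95] * 3 + [0.05] * 3,
                    "green": [0.05] * 3 + [0.95] * 3}
    questions = {"ask_color": attribute_question(color_scores),
                 "point_at_0": pointing_question(0, 6)}
    print(best_question([1 / 6] * 6, questions))           # many candidates
    print(best_question([0.5, 0.5, 0, 0, 0, 0], questions))  # two red candidates left
```

In this toy setup the attribute question is preferred while many candidates remain plausible, because one answer rules out about half of them. Once the belief narrows to two objects that share the same color, the color question no longer separates them and the pointing question becomes the better choice.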


