Few-Shot Visual Grounding for Natural Human-Robot Interaction

03/17/2021
by   Giorgos Tziafas, et al.

Natural Human-Robot Interaction (HRI) is a key capability for service robots that work in human-centric environments. In such dynamic environments, the robot needs to understand the user's intention in order to accomplish a task successfully. To address this, we propose a software architecture that segments a target object, indicated verbally by a human user, from a crowded scene. At the core of our system, we employ a multi-modal deep neural network for visual grounding. Unlike most grounding methods, which tackle the challenge with pre-trained object detectors in a two-stage process, we develop a single-stage zero-shot model that is able to make predictions on unseen data. We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets. Experimental results show that the proposed model performs well in terms of accuracy and speed, while remaining robust to variation in the natural language input.


research
09/14/2023

PROGrasp: Pragmatic Human-Robot Communication for Object Grasping

Interactive Object Grasping (IOG) is the task of identifying and graspin...
research
06/11/2018

Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction

This paper presents INGRESS, a robot system that follows human natural l...
research
09/05/2022

Trust in Language Grounding: a new AI challenge for human-robot teams

The challenge of language grounding is to fully understand natural langu...
research
03/23/2023

ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding

Aiming to link natural language descriptions to specific regions in a 3D...
research
12/12/2018

Towards Understanding Language through Perception in Situated Human-Robot Interaction: From Word Grounding to Grammar Induction

Robots are widely collaborating with human users in different tasks that ...
research
05/24/2022

Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution

Service robots should be able to interact naturally with non-expert huma...
research
07/05/2020

Unsupervised Online Grounding of Natural Language during Human-Robot Interactions

Allowing humans to communicate through natural language with robots requ...
