Learning by Asking for Embodied Visual Navigation and Task Completion

02/09/2023
by Ying Shen, et al.

The research community has shown increasing interest in designing intelligent embodied agents that can assist humans in accomplishing tasks. Despite recent progress on related vision-language benchmarks, most prior work has focused on building agents that follow instructions rather than endowing agents with the ability to ask questions that actively resolve ambiguities arising naturally in embodied environments. To empower embodied agents with the ability to interact with humans, we propose an Embodied Learning-By-Asking (ELBA) model that learns when to ask questions and what to ask, so that it can dynamically acquire additional information for completing the task. We evaluate our model on the TEACH vision-dialog navigation and task completion dataset. Experimental results show that ELBA achieves improved task performance compared to baseline models without question-answering capabilities.

