Towards Versatile Embodied Navigation

10/30/2022
by   Hanqing Wang, et al.

With the emergence of varied visual navigation tasks (e.g., image-/object-/audio-goal and vision-language navigation) that specify the target in different ways, the community has made notable progress in training specialized agents that each handle an individual navigation task well. Given the many embodied navigation tasks and task-specific solutions, we address a more fundamental question: can we learn a single powerful agent that masters not one but multiple navigation tasks concurrently? First, we propose VXN, a large-scale 3D dataset that instantiates four classic navigation tasks in standardized, continuous, and audiovisual-rich environments. Second, we propose Vienna, a versatile embodied navigation agent that simultaneously learns to perform the four navigation tasks with one model. Building upon a full-attentive architecture, Vienna formulates the various navigation tasks as a unified parse-and-query procedure: the target description, augmented with four task embeddings, is interpreted into a set of diversified goal vectors, which are refined as navigation progresses and used as queries to retrieve supportive context from the episodic history for decision making. This enables the reuse of knowledge across navigation tasks with varying input domains and modalities. We empirically demonstrate that, compared with learning each visual navigation task individually, our multitask agent achieves comparable or even better performance with reduced complexity.
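To make the parse-and-query idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that "parse" can be approximated by a set of goal slots attending over the task-conditioned target description, and that "query" is plain scaled dot-product attention over stored history features. All names (`attend`, `parse_goal`, `query_history`, `goal_slots`), the choice to add the task embedding to each token, and the dimensions are hypothetical and chosen only for illustration; the random arrays stand in for learned parameters and encoded observations.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(queries, keys, values):
    """Scaled dot-product attention; returns attended values and weights."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights


def parse_goal(target_tokens, task_id, task_embeddings, goal_slots):
    """Parse step (assumed form): condition the target description on the
    task by adding a task embedding, then let each goal slot attend over
    the conditioned tokens to produce one goal vector per slot."""
    tokens = target_tokens + task_embeddings[task_id]
    goal_vectors, _ = attend(goal_slots, tokens, tokens)
    return goal_vectors


def query_history(goal_vectors, history):
    """Query step (assumed form): use the goal vectors as queries to
    retrieve context from the episodic history features."""
    return attend(goal_vectors, history, history)


# Hypothetical sizes: feature width, number of tasks, number of goal vectors.
D, NUM_TASKS, NUM_GOALS = 16, 4, 3
rng = np.random.default_rng(0)

task_embeddings = rng.normal(size=(NUM_TASKS, D))  # one embedding per task
goal_slots = rng.normal(size=(NUM_GOALS, D))       # stand-in for learned slots
target_tokens = rng.normal(size=(5, D))            # encoded target description
history = rng.normal(size=(10, D))                 # 10 past observation features

goals = parse_goal(target_tokens, 2, task_embeddings, goal_slots)
context, weights = query_history(goals, history)
```

In this sketch each row of `weights` is a distribution over the ten history steps, so `context` holds one retrieved summary per goal vector. The same two functions serve every task; only `task_id` and the encoder that produces `target_tokens` would differ, which is one way to read the abstract's claim about reusing knowledge across tasks. The refinement of goal vectors during navigation is not modeled here.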


research
02/05/2022

Zero Experience Required: Plug Play Modular Transfer Learning for Semantic Visual Navigation

In reinforcement learning for visual navigation, it is common to develop...
research
08/21/2020

Exploiting Scene-specific Features for Object Goal Navigation

Can the intrinsic relation between an object and the room in which it is...
research
10/14/2022

AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments

Recent years have seen embodied visual navigation advance in two distinc...
research
11/29/2022

Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances

We consider the problem of embodied visual navigation given an image-goa...
research
03/06/2023

Robustness of Utilizing Feedback in Embodied Visual Navigation

This paper presents a framework for training an agent to actively reques...
research
02/06/2016

End-to-End Goal-Driven Web Navigation

We propose a goal-driven web navigation as a benchmark task for evaluati...
research
08/20/2023

Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation

Audio-visual navigation is an audio-targeted wayfinding task where a rob...
