Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation

08/20/2023
by Jinyu Chen, et al.

Audio-visual navigation is an audio-targeted wayfinding task in which a robot agent must traverse a never-before-seen 3D environment to reach a sounding source. In this article, we present ORAN, an omnidirectional audio-visual navigator based on cross-task navigation skill transfer. In particular, ORAN sharpens the two basic abilities that such a challenging task requires, namely wayfinding and audio-visual information gathering. First, ORAN is trained with a confidence-aware cross-task policy distillation (CCPD) strategy. CCPD transfers the fundamental point-to-point wayfinding skill, well trained on the large-scale PointGoal task, to ORAN, helping ORAN master audio-visual navigation with far fewer training samples. To improve the efficiency of knowledge transfer and address the domain gap, CCPD adapts to the decision confidence of the teacher policy. Second, ORAN is equipped with an omnidirectional information gathering (OIG) mechanism, i.e., it gleans visual-acoustic observations from different directions before making a decision. As a result, ORAN yields more robust navigation behaviour. Taking CCPD and OIG together, ORAN significantly outperforms previous competitors. After model ensembling, we ranked 1st in the SoundSpaces Challenge 2022, improving SPL and SR by 53
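The idea of making distillation adaptive to the teacher's decision confidence can be illustrated with a minimal sketch. The abstract does not give the CCPD formulation, so everything below is an assumption for illustration: the confidence measure (one minus the normalized entropy of the teacher's action distribution), the use of a KL divergence as the distillation term, the four-action logit vectors, and all function names are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw action logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_confidence(probs):
    """Hypothetical confidence score in [0, 1]: 1 - normalized entropy.

    A uniform distribution gives 0 (teacher is undecided); a one-hot
    distribution gives 1 (teacher is certain). This choice is an
    assumption, not the measure used in the paper.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(len(probs))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def confidence_weighted_distillation_loss(teacher_logits, student_logits):
    """Distillation term scaled by how confident the teacher is.

    When the teacher (e.g. a PointGoal policy) is unsure, its guidance
    contributes little; when it is decisive, the student is pulled
    strongly toward the teacher's action distribution.
    """
    teacher = softmax(teacher_logits)
    student = softmax(student_logits)
    return teacher_confidence(teacher) * kl_divergence(teacher, student)


if __name__ == "__main__":
    student = [0.0, 1.0, 0.0, 0.0]
    undecided_teacher = [0.0, 0.0, 0.0, 0.0]
    decisive_teacher = [5.0, 0.0, 0.0, 0.0]
    print(confidence_weighted_distillation_loss(undecided_teacher, student))
    print(confidence_weighted_distillation_loss(decisive_teacher, student))
```

In this sketch an undecided teacher contributes almost nothing to the loss, while a decisive teacher dominates it, which is one plausible way to keep unreliable cross-task guidance from misleading the student where the two tasks differ.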


