SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

06/16/2022
by Changan Chen, et al.

We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments. Given a 3D mesh of a real-world environment, SoundSpaces can generate highly realistic acoustics for arbitrary sounds captured from arbitrary microphone locations. Together with existing 3D visual assets, it supports an array of audio-visual research tasks, such as audio-visual navigation, mapping, source localization and separation, and acoustic matching. Compared to existing resources, SoundSpaces 2.0 allows continuous spatial sampling, generalizes to novel environments, and supports configurable microphone and material properties. To the best of our knowledge, this is the first geometry-based acoustic simulation that offers high fidelity and realism while also being fast enough to use for embodied learning. We showcase the simulator's properties and benchmark its performance against real-world audio measurements. In addition, we demonstrate the simulator's utility through two downstream tasks covering embodied navigation and far-field automatic speech recognition, highlighting sim2real performance for the latter. SoundSpaces 2.0 is publicly available to facilitate wider research for perceptual systems that can both see and hear.


research
12/24/2019

Audio-Visual Embodied Navigation

Moving around in the world is naturally a multisensory experience, but t...
research
02/14/2022

Visual Acoustic Matching

We introduce the visual acoustic matching task, in which an audio clip i...
research
12/21/2020

Semantic Audio-Visual Navigation

Recent work on audio-visual navigation assumes a constantly-sounding tar...
research
09/15/2023

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

Previous Multimodal Information based Speech Processing (MISP) challenge...
research
06/14/2021

Learning Audio-Visual Dereverberation

Reverberation from audio reflecting off surfaces and objects in the envi...
research
06/08/2022

Few-Shot Audio-Visual Learning of Environment Acoustics

Room impulse response (RIR) functions capture how the surrounding physic...
research
08/20/2023

Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation

Audio-visual navigation is an audio-targeted wayfinding task where a rob...
