Neural Sound Field Decomposition with Super-resolution of Sound Direction

10/22/2022
by   Qiuqiang Kong, et al.

Sound field decomposition predicts waveforms in arbitrary directions using signals from a limited number of microphones as inputs. It is fundamental to downstream tasks, including source localization, source separation, and spatial audio reproduction. Conventional sound field decomposition methods such as Ambisonics have limited spatial decomposition resolution. This paper proposes a learning-based Neural Sound field Decomposition (NeSD) framework that allows sound field decomposition with fine spatial direction resolution, using recordings from a few microphones at arbitrary positions. The inputs of a NeSD system include microphone signals, microphone positions, and queried directions. The outputs of a NeSD system include the waveform and the presence probability of a queried direction. We model NeSD systems with several neural network architectures, including fully connected, time-delay, and recurrent neural networks. We show that the NeSD systems outperform conventional Ambisonics and DOANet methods in sound field decomposition and source localization on speech, music, and sound event datasets. Demos are available at https://www.youtube.com/watch?v=0GIr6doj3BQ.
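
The input/output interface described in the abstract can be illustrated with a minimal sketch. Everything named here (`ToyNeSD`, `query`, `direction_to_unit_vector`, the frame length, the microphone layout) is a hypothetical stand-in and not the authors' code: the model is a single untrained fully connected layer with random weights, so it only shows the shapes of what goes in and what comes out, assuming a queried direction is given as azimuth and elevation.

```python
import math
import random


def direction_to_unit_vector(azimuth, elevation):
    """Convert azimuth/elevation in radians to a 3-D unit vector."""
    return (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )


class ToyNeSD:
    """Hypothetical, untrained stand-in for a NeSD-style model.

    One fully connected layer maps (microphone signals, microphone
    positions, queried direction) to a waveform frame and a presence
    probability. The weights are random; this is not the paper's network.
    """

    def __init__(self, num_mics, frame_len, seed=0):
        rng = random.Random(seed)
        in_dim = num_mics * frame_len + num_mics * 3 + 3
        self.w_wave = [
            [rng.gauss(0.0, 0.01) for _ in range(in_dim)]
            for _ in range(frame_len)
        ]
        self.w_prob = [rng.gauss(0.0, 0.01) for _ in range(in_dim)]

    def query(self, mic_signals, mic_positions, direction):
        # Flatten all three inputs into a single feature vector.
        features = [s for signal in mic_signals for s in signal]
        features += [c for position in mic_positions for c in position]
        features += list(direction)
        # Output 1: a waveform frame for the queried direction.
        waveform = [
            sum(w * x for w, x in zip(row, features)) for row in self.w_wave
        ]
        # Output 2: presence probability via a sigmoid on a scalar logit.
        logit = sum(w * x for w, x in zip(self.w_prob, features))
        presence = 1.0 / (1.0 + math.exp(-logit))
        return waveform, presence


# Four microphones at assumed positions (metres), 8-sample silent frames.
mic_positions = [(0.05, 0.0, 0.0), (-0.05, 0.0, 0.0),
                 (0.0, 0.05, 0.0), (0.0, -0.05, 0.0)]
mic_signals = [[0.0] * 8 for _ in mic_positions]

model = ToyNeSD(num_mics=4, frame_len=8)
direction = direction_to_unit_vector(math.radians(30.0), 0.0)
wave, prob = model.query(mic_signals, mic_positions, direction)
```

Because the direction is an input rather than a fixed output channel, such a model can be queried on an arbitrarily fine grid of directions, which is the property the abstract contrasts with the limited spatial resolution of Ambisonics.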

research
09/13/2023

Sound field decomposition based on two-stage neural networks

A method for sound field decomposition based on neural networks is propo...
research
06/07/2021

PILOT: Introducing Transformers for Probabilistic Sound Event Localization

Sound event localization aims at estimating the positions of sound sourc...
research
12/17/2018

Quaternion Convolutional Neural Networks for Detection and Localization of 3D Sound Events

Learning from data in the quaternion domain enables us to exploit intern...
research
10/19/2021

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks

The cocktail party problem aims at isolating any source of interest with...
research
06/21/2021

MeshRIR: A Dataset of Room Impulse Responses on Meshed Grid Points For Evaluating Sound Field Analysis and Synthesis Methods

A new impulse response (IR) dataset called "MeshRIR" is introduced. Curr...
research
08/08/2023

Dual input neural networks for positional sound source localization

In many signal processing applications, metadata may be advantageously u...
research
03/04/2018

Multiple Sound Source Localisation with Steered Response Power Density and Hierarchical Grid Refinement

Estimation of the direction-of-arrival (DOA) of sound sources is an impo...
