PILOT: Introducing Transformers for Probabilistic Sound Event Localization

06/07/2021
by   Christopher Schymura, et al.

Sound event localization aims to estimate the positions of sound sources in the environment with respect to an acoustic receiver (e.g. a microphone array). Recent advances in this domain have focused most prominently on deep recurrent neural networks. Inspired by the success of transformer architectures as an alternative to classical recurrent neural networks, this paper introduces a novel transformer-based sound event localization framework in which temporal dependencies in the received multi-channel audio signals are captured via self-attention mechanisms. Additionally, the estimated sound event positions are represented as multivariate Gaussian variables, yielding a notion of uncertainty that many previously proposed deep learning-based systems for this application do not provide. The framework is evaluated on three publicly available multi-source sound event localization datasets and compared against state-of-the-art methods in terms of localization error and event detection accuracy. It outperforms all competing systems on all datasets, with statistically significant differences in performance.
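
The two ideas named in the abstract, self-attention over time frames and a multivariate Gaussian position estimate, can be illustrated with a minimal NumPy sketch. This is not the architecture from the paper: the single attention head with identity projections, the feature dimensions, the azimuth/elevation parameterization, and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np


def self_attention(x):
    """Single-head scaled dot-product self-attention (illustrative only).

    x has shape (T, d): T time frames with d features each. Query, key and
    value projections are the identity here, which is an assumption made to
    keep the sketch short.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                   # (T, T) frame similarities
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x, weights


def gaussian_nll(target, mean, cov):
    """Negative log-likelihood of `target` under N(mean, cov)."""
    k = mean.shape[0]
    diff = target - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)        # squared Mahalanobis distance
    return 0.5 * (k * np.log(2.0 * np.pi) + logdet + maha)


# Hypothetical input: 5 time frames with 4 features each.
rng = np.random.default_rng(0)
frames = rng.standard_normal((5, 4))
context, weights = self_attention(frames)

# Hypothetical position estimate: azimuth and elevation in radians, with a
# covariance matrix expressing how uncertain the estimate is.
mean = np.array([0.3, -0.1])
cov = np.array([[0.04, 0.0],
                [0.0, 0.09]])
nll = gaussian_nll(np.array([0.35, 0.0]), mean, cov)
```

In this sketch, a wider covariance lowers the penalty for a distant target but raises the log-determinant term, which is how a Gaussian output can express uncertainty rather than only a point estimate.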

research
02/28/2021

Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization

Sound event localization frameworks based on deep neural networks have s...
research
03/03/2020

SELD-TCN: Sound Event Localization Detection via Temporal Convolutional Networks

The understanding of the surrounding environment plays a critical role i...
research
07/20/2021

Assessment of Self-Attention on Learned Features For Sound Event Localization and Detection

Joint sound event localization and detection (SELD) is an emerging audio...
research
10/22/2022

Neural Sound Field Decomposition with Super-resolution of Sound Direction

Sound field decomposition predicts waveforms in arbitrary directions usi...
research
07/08/2022

BAST: Binaural Audio Spectrogram Transformer for Binaural Sound Localization

Accurate sound localization in a reverberation environment is essential ...
research
10/12/2022

Enemy Spotted: in-game gun sound dataset for gunshot classification and localization

Recently, deep learning-based methods have drawn huge attention due to t...
research
09/26/2022

Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

Many state-of-the-art systems for audio tagging and sound event detectio...
