Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks

06/30/2018
by Sharath Adavanne, et al.

In this paper, we propose a convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3D) space. The proposed network takes a sequence of consecutive spectrogram time-frames as input and maps it to two outputs in parallel. As the first output, sound event detection (SED) is performed as a multi-label multi-class classification task on each time-frame, producing the temporal activity of all the sound event classes. As the second output, localization is performed by estimating the 3D Cartesian coordinates of the direction-of-arrival (DOA) for each sound event class using multi-class regression. The proposed method is able to associate multiple DOAs with their respective sound event labels and to track this association over time. The proposed method uses the phase and magnitude components of the spectrogram, calculated on each audio channel, as separate features, thereby avoiding any method- and array-specific feature extraction. The method is evaluated on five Ambisonic and two circular array format datasets with different overlapping sound events in anechoic, reverberant and real-life scenarios. The proposed method is compared with two SED baselines, three DOA estimation baselines, and one SELD baseline. The results show that the proposed method is generic to array structures and robust to unseen DOA labels, reverberation, and low SNR scenarios. Compared to the respective standalone baselines, the proposed joint estimation of DOA and SED resulted in a consistently higher recall of the estimated number of DOAs across datasets.
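To make the two parallel outputs concrete, the following is a minimal sketch of how one time-frame of such a network's output could be decoded into labelled directions. It is not the authors' implementation: the function name `decode_frame`, the 0.5 activity threshold, and the axis convention (azimuth measured in the x-y plane from the x axis, elevation measured from the x-y plane) are all assumptions made for illustration.

```python
import math


def decode_frame(sed_probs, doa_xyz, threshold=0.5):
    """Pair each active sound event class with a DOA for one time-frame.

    sed_probs: per-class activity probabilities (the SED output).
    doa_xyz:   per-class (x, y, z) DOA estimates (the regression output).
    threshold: hypothetical activity threshold, not taken from the paper.
    """
    events = []
    for class_idx, (prob, (x, y, z)) in enumerate(zip(sed_probs, doa_xyz)):
        if prob < threshold:
            continue  # class judged inactive; its DOA estimate is ignored
        # Assumed convention: azimuth in the x-y plane, elevation above it.
        azimuth = math.degrees(math.atan2(y, x))
        elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
        events.append((class_idx, azimuth, elevation))
    return events


# Three hypothetical classes; only classes 0 and 2 exceed the threshold.
frame_events = decode_frame(
    [0.9, 0.1, 0.7],
    [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)],
)
```

Because each class carries its own coordinate triplet, the class index ties every DOA to a sound event label, which is the association the abstract describes; repeating the decoding on consecutive frames gives a per-class trajectory over time.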

