Sound Event Localization and Detection using Squeeze-Excitation Residual CNNs

06/25/2020
by   Javier Naranjo-Alcazar, et al.

Sound Event Localization and Detection (SELD) is a machine listening problem whose objective is to recognize individual sound events, detect their temporal activity, and estimate their spatial location. Thanks to the emergence of more hard-labeled audio datasets, Deep Learning techniques have become the state-of-the-art solutions. The most common approach is a convolutional recurrent neural network (CRNN) applied after the audio signal has been transformed into a multichannel 2D representation. The squeeze-excitation technique can be considered a convolution enhancement that aims to learn spatial and channel feature maps independently rather than jointly, as standard convolutions do. This is usually achieved by combining global pooling operators, linear operators and a final recalibration between the block input and its learned relationships. This work aims to improve on the results of the baseline CRNN presented in DCASE 2020 Task 3 by adding residual squeeze-excitation (SE) blocks to the convolutional part of the CRNN. The procedure involves a grid search over the ratio parameter of the residual SE block (used in its linear operators), while the remaining hyperparameters of the network are kept the same as in the baseline. Experiments show that simply introducing the residual SE blocks yields results that clearly exceed the baseline.
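The squeeze, excitation and recalibration steps described in the abstract can be illustrated with a minimal, dependency-free sketch. It shows only the channel-wise variant with an identity shortcut. The abstract does not give the paper's exact block layout, so the function names, the random initialisation, the omission of the convolutional layers and the grid of ratio values are all assumptions made for illustration.

```python
import math
import random


def global_avg_pool(x):
    """Squeeze: reduce a [C][H][W] feature map to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def linear(v, weights, bias):
    """Dense layer: weights is [out][in], bias is [out]."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]


def init_dense(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def make_se_params(channels, ratio, seed=0):
    """Two dense layers; the bottleneck has channels // ratio units."""
    rng = random.Random(seed)
    hidden = max(1, channels // ratio)
    return init_dense(channels, hidden, rng), init_dense(hidden, channels, rng)


def se_residual_block(x, params):
    """Channel SE recalibration of x followed by an identity shortcut.

    Assumption: in a real network the recalibrated tensor would be the output
    of convolutional layers; they are omitted so the SE steps stay visible.
    """
    (w1, b1), (w2, b2) = params
    squeezed = global_avg_pool(x)                               # [C]
    hidden = [max(0.0, h) for h in linear(squeezed, w1, b1)]    # ReLU
    gates = [1.0 / (1.0 + math.exp(-g)) for g in linear(hidden, w2, b2)]  # sigmoid
    # Scale every channel by its gate, then add the block input (residual).
    return [
        [[v * gate + v for v in row] for row in ch]
        for ch, gate in zip(x, gates)
    ]


rng = random.Random(1)
channels, height, width = 8, 4, 3
feature_map = [[[rng.random() for _ in range(width)] for _ in range(height)]
               for _ in range(channels)]

# Hypothetical grid of ratio values; the abstract does not list the ones searched.
for ratio in (1, 2, 4, 8):
    params = make_se_params(channels, ratio)
    out = se_residual_block(feature_map, params)
    # Bottleneck width shrinks with the ratio; the output shape is unchanged.
    print(ratio, len(params[0][0]), len(out), len(out[0]), len(out[0][0]))
```

The ratio only changes the width of the bottleneck between the two linear operators, which is why it can be searched while every other hyperparameter stays fixed.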


research
03/20/2020

On the performance of different excitation-residual blocks for Acoustic Scene Classification

Acoustic Scene Classification (ASC) is a problem related to the field of...
research
07/30/2021

TASK3 DCASE2021 Challenge: Sound event localization and detection using squeeze-excitation residual CNNs

Sound event localisation and detection (SELD) is a problem in the field ...
research
03/20/2020

Acoustic Scene Classification with Squeeze-Excitation Residual Networks

Acoustic scene classification (ASC) is a problem related to the field of...
research
08/04/2019

Sound Event Detection in Multichannel Audio using Convolutional Time-Frequency-Channel Squeeze and Excitation

In this study, we introduce a convolutional time-frequency-channel "Sque...
research
07/24/2018

Competitive Inner-Imaging Squeeze and Excitation for Residual Network

Residual Network make the very deep convolutional architecture works wel...
research
06/22/2020

Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net

Our systems submitted to the DCASE2020 task 3: Sound Event Localization ...
research
02/16/2019

RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections

One of the ways to train deep neural networks effectively is to use resi...
