Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

09/26/2022
by Wim Boes, et al.

Many state-of-the-art systems for audio tagging and sound event detection employ convolutional recurrent neural architectures. Typically, they are trained in a mean teacher setting to deal with the heterogeneous annotation of the available data. In this work, we present a thorough analysis of how changing the temporal resolution of these convolutional recurrent neural networks (which can be done by simply adapting their pooling operations) impacts their performance. By using a variety of evaluation metrics, we investigate the effects of adapting this design parameter under several sound recognition scenarios involving different needs in terms of temporal localization.
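To make the link between pooling and temporal resolution concrete, the following is a minimal sketch, not the authors' implementation. It assumes non-overlapping pooling along the time axis (stride equal to kernel size, no padding), so the output frame rate is the input frame rate divided by the product of the time pooling factors. The function names, the input length of 640 frames, the 64 Hz input frame rate, and the pooling configurations are all hypothetical values chosen for illustration.

```python
from math import prod


def output_frames(n_input_frames, time_pool_factors):
    """Frames left after a stack of non-overlapping poolings along time.

    Assumes each pooling layer uses stride == kernel size and no padding,
    so each layer floor-divides the number of frames by its factor.
    """
    frames = n_input_frames
    for factor in time_pool_factors:
        frames //= factor
    return frames


def output_frame_rate(input_frame_rate_hz, time_pool_factors):
    """Output frame rate in Hz under the same pooling assumption."""
    return input_frame_rate_hz / prod(time_pool_factors)


# Hypothetical input: 640 spectrogram frames at 64 frames per second.
n_frames = 640
frame_rate_hz = 64.0

# Two hypothetical configurations that differ only in time pooling.
coarse = [2, 2, 2]  # pools time by 2 in each of three blocks
fine = [2, 1, 1]    # pools time only in the first block

for name, factors in [("coarse", coarse), ("fine", fine)]:
    print(name,
          output_frames(n_frames, factors),
          output_frame_rate(frame_rate_hz, factors))
```

Under these assumptions, a factor of 1 in a block leaves that block's time axis untouched, so replacing factors of 2 with 1 raises the temporal resolution without changing the rest of the network.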


research
10/18/2022

Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection

In this technical report, the systems we submitted for subtask 4 of the ...
research
04/11/2019

Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems

The Detection and Classification of Acoustic Scenes and Events (DCASE) 2...
research
03/03/2020

SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks

The understanding of the surrounding environment plays a critical role i...
research
10/22/2018

A comparison of five multiple instance learning pooling functions for sound event detection with weak labeling

Sound event detection (SED) entails two subtasks: recognizing what types...
research
06/07/2021

PILOT: Introducing Transformers for Probabilistic Sound Event Localization

Sound event localization aims at estimating the positions of sound sourc...
research
07/09/2020

Low Cost Gunshot Detection using Deep Learning on the Raspberry Pi

Many cities using gunshot detection technology depend on expensive syste...
research
07/30/2021

TASK3 DCASE2021 Challenge: Sound event localization and detection using squeeze-excitation residual CNNs

Sound event localisation and detection (SELD) is a problem in the field ...
