Polyphonic Sound Event and Sound Activity Detection: A Multi-task approach

07/11/2019
by   Arjun Pankajakshan, et al.

Polyphonic Sound Event Detection (SED) in real-world recordings is challenging because the polyphony level, intensity, and duration of sound events vary dynamically. Current polyphonic SED systems do not model the temporal structure of sound events explicitly; instead, they predict which sound events are present in each audio frame. Consequently, event-wise detection performance is much lower than segment-wise detection performance. In this work, we propose a joint model that improves the temporal localization of sound events using a multi-task learning setup. The first task predicts which sound events are present at each time frame; we call this branch the 'Sound Event Detection (SED) model'. The second task predicts whether a sound event is present at each frame; we call this branch the 'Sound Activity Detection (SAD) model'. We verify the proposed joint model by comparing it with separate implementations of the two tasks, with the individual task predictions aggregated. Our experiments on the URBAN-SED dataset show that the proposed joint model reduces False Positive (FP) and False Negative (FN) errors and improves both the segment-wise and the event-wise metrics.
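
To make the two-branch setup concrete, the following is a minimal sketch of how frame-level targets and a joint objective for the SED and SAD tasks could be formed. It is not the authors' implementation: the abstract does not say how SAD labels are obtained or how the two losses are combined. The sketch assumes that a frame counts as active when at least one event class is active in it, that both branches use binary cross-entropy, and that the losses are summed with a hypothetical `sad_weight` factor. All function names are illustrative.

```python
import math


def sad_targets_from_sed(sed_targets):
    """Collapse per-class frame labels into one activity label per frame.

    Assumption: a frame is 'active' if any event class is active in it.
    sed_targets is a list of frames, each a list of 0/1 class labels.
    """
    return [1.0 if any(v > 0 for v in frame) else 0.0 for frame in sed_targets]


def binary_cross_entropy(preds, targets, eps=1e-7):
    """Mean binary cross-entropy over two flat lists of equal length."""
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(preds)


def joint_loss(sed_preds, sad_preds, sed_targets, sad_weight=1.0):
    """Hypothetical joint objective: SED loss plus weighted SAD loss."""
    flat_preds = [p for frame in sed_preds for p in frame]
    flat_targets = [t for frame in sed_targets for t in frame]
    sed_loss = binary_cross_entropy(flat_preds, flat_targets)
    sad_loss = binary_cross_entropy(sad_preds, sad_targets_from_sed(sed_targets))
    return sed_loss + sad_weight * sad_loss


# Four frames, three event classes; frames 1 and 2 contain events.
targets = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 0, 0]]
print(sad_targets_from_sed(targets))
print(joint_loss([[0.5] * 3] * 4, [0.5] * 4, targets))
```

In a multi-task network the two branches would typically share lower layers, so a combined loss of this kind lets the activity task influence the features used for event classification. Whether the paper uses this exact formulation is not stated in the abstract.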


