Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks

11/02/2018
by Oliver Y. Chén, et al.

We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events. The framework leverages the power of convolutional recurrent neural network architectures; convolutional layers learn effective features over which higher recurrent layers perform sequential modelling. Furthermore, the output layer is designed to handle arbitrary degrees of event overlap. At each time step in the recurrent output sequence, an output triple is dedicated to each event category of interest to jointly model event occurrence and temporal boundaries. That is, the network jointly determines whether an event of this category occurs, and when it occurs, by estimating onset and offset positions at each recurrent time step. We then introduce three sequential losses for network training: multi-label classification loss, distance estimation loss, and confidence loss. We demonstrate good generalization on two datasets: ITC-Irst for isolated audio event detection, and TUT-SED-Synthetic-2016 for overlapping audio event detection.
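The output triple and the three losses described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_targets` and `sequential_losses` are hypothetical, distances are assumed to be normalized by the sequence length, and the specific loss forms (binary cross-entropy for classification, squared error for distances, and one minus interval intersection-over-union for confidence) are stand-ins chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def make_targets(events, num_steps, num_classes):
    """Build per-step target triples (activity, onset distance, offset distance).

    events: list of (class_index, onset_step, offset_step), offset inclusive.
    Distances are normalized by num_steps (an assumption of this sketch).
    Two events of the same class overlapping in time would overwrite each
    other here; events of different classes may overlap freely.
    """
    active = np.zeros((num_steps, num_classes))
    d_on = np.zeros((num_steps, num_classes))
    d_off = np.zeros((num_steps, num_classes))
    for c, on, off in events:
        t = np.arange(on, off + 1)
        active[t, c] = 1.0
        d_on[t, c] = (t - on) / num_steps    # distance back to the onset
        d_off[t, c] = (off - t) / num_steps  # distance forward to the offset
    return active, d_on, d_off

def sequential_losses(p, d_on_hat, d_off_hat, active, d_on, d_off, eps=1e-7):
    """Return (classification, distance, confidence) losses over a sequence.

    All three formulations are illustrative assumptions, not taken from the paper.
    """
    # Multi-label classification: independent binary cross-entropy per class.
    p = np.clip(p, eps, 1.0 - eps)
    cls = -np.mean(active * np.log(p) + (1.0 - active) * np.log(1.0 - p))

    # Distance estimation: squared error, counted only where an event is active.
    n = max(active.sum(), 1.0)
    sq_err = (d_on_hat - d_on) ** 2 + (d_off_hat - d_off) ** 2
    dist = np.sum(active * sq_err) / n

    # Confidence: 1 - IoU between the predicted and true intervals around step t.
    # Both intervals contain t, so lengths follow directly from the distances.
    inter = np.minimum(d_on_hat, d_on) + np.minimum(d_off_hat, d_off)
    union = np.maximum(d_on_hat, d_on) + np.maximum(d_off_hat, d_off)
    iou = np.where(union > 0, inter / np.maximum(union, eps), 1.0)
    conf = np.sum(active * (1.0 - iou)) / n
    return cls, dist, conf

# Two overlapping events of different classes in a 10-step sequence.
active, d_on, d_off = make_targets([(0, 2, 5), (1, 4, 8)], num_steps=10, num_classes=2)
# Feeding the targets back as predictions should drive all three losses to ~0.
cls_loss, dist_loss, conf_loss = sequential_losses(active, d_on, d_off, active, d_on, d_off)
```

Because each category has its own triple at every step, steps 4 and 5 in this example carry active targets for both classes at once, which is how the per-category design accommodates overlap without a separate mechanism.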

research
06/30/2018

Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks

In this paper, we propose a convolutional recurrent neural network for j...
research
08/20/2018

R-CRNN: Region-based Convolutional Recurrent Neural Network for Audio Event Detection

This paper proposes a Region-based Convolutional Recurrent Neural Networ...
research
10/09/2017

Sound event detection using weakly labeled dataset with stacked convolutional and recurrent neural network

This paper proposes a neural network architecture and training scheme to...
research
03/07/2017

Convolutional Recurrent Neural Networks for Bird Audio Detection

Bird sounds possess distinctive spectral structure which may exhibit sma...
research
07/08/2016

CaR-FOREST: Joint Classification-Regression Decision Forests for Overlapping Audio Event Detection

This report describes our submissions to Task2 and Task3 of the DCASE 20...
research
12/06/2017

Enabling Early Audio Event Detection with Neural Networks

This paper presents a methodology for early detection of audio events fr...
research
09/28/2018

SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging

Automatic sleep staging has been often treated as a simple classificatio...
