Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings

In this paper we present an approach to polyphonic sound event detection in real-life recordings based on bidirectional long short-term memory (BLSTM) recurrent neural networks (RNNs). A single multilabel BLSTM RNN is trained to map acoustic features of a mixture signal, consisting of sounds from multiple classes, to binary activity indicators for each event class. Our method is tested on a large database of real-life recordings, with 61 classes (e.g. music, car, speech) from 10 different everyday contexts. The proposed method outperforms previous approaches by a large margin, and the results are further improved using data augmentation techniques. Overall, our system reports an average F1-score of 65.5%, a relative improvement of 6.8% over the previous state-of-the-art approach.
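To make the multilabel formulation concrete, the sketch below shows one way the "binary activity indicators of each event class" could be laid out as a frame-by-class matrix, and how thresholded network outputs might be scored with an F1-score. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 0.5 threshold, the 1-second frame hop, and the micro-averaged F1 definition are all hypothetical choices that the abstract does not specify.

```python
import math
from typing import List, Sequence, Tuple

Event = Tuple[float, float, str]  # (onset in seconds, offset in seconds, class label)


def activity_matrix(events: Sequence[Event], classes: Sequence[str],
                    n_frames: int, hop_s: float) -> List[List[int]]:
    """Build frame-by-class binary targets; overlapping events set several columns."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in range(n_frames)]
    for onset, offset, label in events:
        first = max(0, int(onset / hop_s))
        last = min(n_frames, math.ceil(offset / hop_s))
        for t in range(first, last):
            matrix[t][index[label]] = 1
    return matrix


def binarize(probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-class output probabilities into activity decisions (assumed threshold)."""
    return [[1 if p >= threshold else 0 for p in frame] for frame in probs]


def f1_score(reference: Sequence[Sequence[int]], estimate: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over all frames and classes (an assumed metric definition)."""
    tp = fp = fn = 0
    for ref_frame, est_frame in zip(reference, estimate):
        for r, e in zip(ref_frame, est_frame):
            tp += 1 if (r and e) else 0
            fp += 1 if (e and not r) else 0
            fn += 1 if (r and not e) else 0
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: three of the class names mentioned in the abstract, four 1-second frames.
classes = ["speech", "car", "music"]
events = [(0.0, 2.0, "speech"), (1.0, 4.0, "car")]  # speech and car overlap in frame 1
targets = activity_matrix(events, classes, n_frames=4, hop_s=1.0)

# Made-up network outputs standing in for the per-class predictions of a trained model.
probs = [[0.9, 0.2, 0.1], [0.8, 0.4, 0.1], [0.3, 0.7, 0.6], [0.1, 0.9, 0.2]]
decisions = binarize(probs)
score = f1_score(targets, decisions)
```

Because each class has its own independent indicator, a single frame can have several active classes at once, which is what distinguishes polyphonic detection from assigning one label per frame.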
