Multichannel Sound Event Detection Using 3D Convolutional Neural Networks for Learning Inter-channel Features

01/29/2018
by Sharath Adavanne, et al.

In this paper, we propose a stacked convolutional and recurrent neural network (CRNN) with a 3D convolutional neural network (CNN) in the first layer for the multichannel sound event detection (SED) task. The 3D CNN enables the network to simultaneously learn inter- and intra-channel features from the input multichannel audio. To evaluate the proposed method, multichannel audio datasets with different numbers of overlapping sound sources are synthesized. Each of these datasets has four-channel first-order Ambisonic, binaural, and single-channel versions, on which the performance of the proposed method is compared to study the potential of SED using multichannel audio. A similar study is also done with the binaural and single-channel versions of the real-life recording TUT-SED 2017 development dataset. The proposed method learns to recognize overlapping sound events from multichannel features faster and performs better SED with fewer training epochs. The results show that using multichannel Ambisonic audio in place of single-channel audio improves the overall F-score by 7.5% and the error rate by 10% [...] four overlapping sound sources.
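To make the inter-channel idea concrete, here is a minimal sketch of a 3D convolution over a (channel, time, frequency) feature volume, written in plain Python. It is not the authors' implementation: the function name `conv3d_valid`, the toy two-channel input, and the kernel shape are assumptions chosen for illustration. A kernel that extends along the channel axis combines values from different channels at the same time-frequency bin, which is what lets a single layer pick up inter-channel cues alongside intra-channel ones.

```python
def conv3d_valid(volume, kernel):
    """Valid (no padding) 3D cross-correlation.

    volume[c][t][f] is a feature volume indexed by channel, time, frequency.
    kernel[c][t][f] is a filter with the same axis order.
    """
    C, T, F = len(volume), len(volume[0]), len(volume[0][0])
    kc, kt, kf = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for c0 in range(C - kc + 1):
        plane = []
        for t0 in range(T - kt + 1):
            row = []
            for f0 in range(F - kf + 1):
                acc = 0.0
                # Sum over the kernel extent, including the channel axis.
                for dc in range(kc):
                    for dt in range(kt):
                        for df in range(kf):
                            acc += (volume[c0 + dc][t0 + dt][f0 + df]
                                    * kernel[dc][dt][df])
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical two-channel input: 2 time frames x 3 frequency bins per channel.
left = [[1, 2, 3], [4, 5, 6]]
right = [[1, 0, 3], [2, 5, 7]]

# A 2x1x1 kernel spanning both channels: computes left minus right
# at every time-frequency bin, i.e. a purely inter-channel feature.
diff_kernel = [[[1.0]], [[-1.0]]]
diff = conv3d_valid([left, right], diff_kernel)
```

In a trained network the kernel weights are learned rather than fixed, and the kernel would typically also extend over time and frequency so that the same filter captures both kinds of structure.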
