Environmental Sound Classification Based on Multi-temporal Resolution Convolutional Neural Network Combining with Multi-level Features

05/24/2018
by Boqing Zhu, et al.

Motivated by the fact that the characteristics of different sound classes vary widely across temporal scales and hierarchical levels, a novel deep convolutional neural network (CNN) architecture is proposed for the environmental sound classification task. The network takes raw waveforms as input, and a set of separate parallel CNNs with different convolutional filter sizes and strides learns feature representations at multiple temporal resolutions. In addition, the proposed architecture aggregates hierarchical features from multiple CNN layers for classification using direct connections between convolutional layers, going beyond the single-level CNN features employed by the majority of previous studies. These connections also improve the flow of information and avoid the vanishing gradient problem. The combination of multi-level features boosts classification performance significantly. Comparative experiments are conducted on two datasets: the environmental sound classification dataset (ESC-50) and the DCASE 2017 audio scene classification dataset. Results demonstrate that employing multi-temporal resolution and multi-level features makes the proposed method highly effective in these classification tasks, and that it outperforms previous methods that account only for single-level features.
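The two ideas in the abstract, parallel branches with different filter sizes and strides, and features collected from several layers of each branch, can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' network: the branch settings in `BRANCHES`, the depth, the single-filter layers, the random (untrained) kernels, and all function names are hypothetical choices made only to show the data flow.

```python
import math
import random


def conv1d_relu(signal, kernel, stride):
    """Valid 1-D cross-correlation with the given stride, followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(signal[s + i] * kernel[i] for i in range(k)))
        for s in range(0, len(signal) - k + 1, stride)
    ]


def random_kernel(size, rng):
    """Untrained stand-in for a learned filter (illustrative assumption)."""
    scale = 1.0 / math.sqrt(size)
    return [rng.uniform(-scale, scale) for _ in range(size)]


def branch_multilevel_features(waveform, first_size, first_stride, depth, rng):
    """Run one branch and keep a pooled summary of every layer's output."""
    features = []
    # The first layer sets the temporal resolution of this branch.
    x = conv1d_relu(waveform, random_kernel(first_size, rng), first_stride)
    features.append(max(x))  # level-1 feature via global max pooling
    for _ in range(depth - 1):
        x = conv1d_relu(x, random_kernel(3, rng), 2)
        features.append(max(x))  # deeper-level feature, also kept
    return features


def extract_features(waveform, branch_configs, depth=3, seed=0):
    """Concatenate multi-level features from all parallel branches."""
    rng = random.Random(seed)
    features = []
    for size, stride in branch_configs:
        features.extend(
            branch_multilevel_features(waveform, size, stride, depth, rng)
        )
    return features


# Hypothetical (filter size, stride) pairs: fine, medium, coarse resolution.
BRANCHES = [(11, 1), (51, 5), (101, 10)]

# A 440 Hz tone sampled at 16 kHz stands in for a raw audio clip.
waveform = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(4000)]
feature_vector = extract_features(waveform, BRANCHES)
print(len(feature_vector))  # 9 = 3 branches x 3 levels
```

In a trained model, the concatenated vector would feed a classifier, and each layer would hold many learned filters rather than one random kernel. The sketch only shows how a single waveform yields features at several temporal resolutions and several depths at once.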

research
08/25/2018

Deep Convolutional Neural Network with Mixup for Environmental Sound Classification

Environmental sound classification (ESC) is an important and challenging...
research
01/24/2019

Multi-stream Network With Temporal Attention For Environmental Sound Classification

Environmental sound classification systems often do not perform robustly...
research
11/11/2018

Multi-Temporal Resolution Convolutional Neural Networks for Acoustic Scene Classification

In this paper we present a Deep Neural Network architecture for the task...
research
04/02/2019

Effective Aesthetics Prediction with Multi-level Spatially Pooled Features

We propose an effective deep learning approach to aesthetics quality ass...
research
12/04/2017

Raw Waveform-based Audio Classification Using Sample-level CNN Architectures

Music, speech, and acoustic scene sound are often handled separately in ...
research
07/14/2020

MFRNet: A New CNN Architecture for Post-Processing and In-loop Filtering

In this paper, we propose a novel convolutional neural network (CNN) arc...