Sample Dropout for Audio Scene Classification Using Multi-Scale Dense Connected Convolutional Neural Network

06/12/2018
by Dawei Feng, et al.

Acoustic scene classification is a difficult problem for machines. Deep convolutional neural networks (CNNs), an emerging approach in this field, achieve convincing results. In this paper, we explore the use of a multi-scale densely connected convolutional neural network (DenseNet) for the classification task, with the goal of improving classification performance by extracting multi-scale features from the time-frequency representation of the audio signal. Most previous CNN-based audio scene classification approaches aim to improve classification accuracy by employing regularization techniques, such as dropout of hidden units and data augmentation, to reduce overfitting. It is widely known that outliers in the training set have a strong negative influence on the trained model, and that culling them may improve classification performance, yet this has been under-explored in previous studies. Inspired by silence removal in speech signal processing, we propose a novel sample dropout approach that aims to remove outliers from the training dataset. On the DCASE 2017 audio scene classification datasets, the experimental results demonstrate that the proposed multi-scale DenseNet outperforms the traditional single-scale DenseNet, and that the sample dropout method further improves the classification robustness of the multi-scale DenseNet.
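
The abstract does not say how outliers are identified, so the following is only a minimal sketch of the general idea of dropping training samples, not the authors' procedure. It assumes each training sample already has an outlier score, here taken to be its loss under a preliminarily trained model, and discards the highest-scoring fraction. The function name `sample_dropout` and the parameter `drop_fraction` are hypothetical.

```python
import numpy as np


def sample_dropout(losses, drop_fraction=0.1):
    """Return sorted indices of the training samples to keep.

    Assumption (not from the paper): a sample is treated as an outlier
    when its per-sample loss is among the largest `drop_fraction` of
    the training set.
    """
    losses = np.asarray(losses, dtype=float)
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")

    # Number of samples to discard, rounded down.
    n_drop = int(np.floor(drop_fraction * losses.size))
    if n_drop == 0:
        return np.arange(losses.size)

    # Sort ascending by loss and keep everything except the n_drop largest.
    order = np.argsort(losses, kind="stable")
    return np.sort(order[: losses.size - n_drop])


# Hypothetical per-sample losses; indices 2 and 9 are clear outliers.
example_losses = [0.2, 0.1, 5.0, 0.3, 0.25, 0.15, 0.4, 0.35, 0.05, 9.0]
kept_indices = sample_dropout(example_losses, drop_fraction=0.2)
```

The kept indices would then select the subset of the training set used for the next round of training. Other outlier scores could be substituted without changing the selection step.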

research
05/18/2018

Mixup-Based Acoustic Scene Classification Using Multi-Channel Convolutional Neural Network

Audio scene classification, the problem of predicting class labels of au...
research
08/06/2021

SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features

A mixed sample data augmentation strategy is proposed to enhance the per...
research
02/26/2019

Acoustic scene classification using multi-layer temporal pooling based on convolutional neural network

The temporal dynamics and the discriminative information in the audio si...
research
02/07/2022

DeepSSN: a deep convolutional neural network to assess spatial scene similarity

Spatial-query-by-sketch is an intuitive tool to explore human spatial kn...
research
06/09/2023

Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network

Domestic activities classification (DAC) from audio recordings aims at c...
research
05/24/2018

Multi-Scale DenseNet-Based Electricity Theft Detection

Electricity theft detection issue has drawn lots of attention during las...
research
11/01/2017

Reducing Model Complexity for DNN Based Large-Scale Audio Classification

Audio classification is the task of identifying the sound categories tha...
