Enhancing Sound Texture in CNN-Based Acoustic Scene Classification

01/06/2019
by   Yuzhong Wu, et al.
0

Acoustic scene classification is the task of identifying the scene from which the audio signal is recorded. Convolutional neural network (CNN) models are widely adopted with proven successes in acoustic scene classification. However, there is little insight on how an audio scene is perceived in CNN, as what have been demonstrated in image recognition research. In the present study, the Class Activation Mapping (CAM) is utilized to analyze how the log-magnitude Mel-scale filter-bank (log-Mel) features of different acoustic scenes are learned in a CNN classifier. It is noted that distinct high-energy time-frequency components of audio signals generally do not correspond to strong activation on CAM, while the background sound texture are well learned in CNN. In order to make the sound texture more salient, we propose to apply the Difference of Gaussian (DoG) and Sobel operator to process the log-Mel features and enhance edge information of the time-frequency image. Experimental results on the DCASE 2017 ASC challenge show that using edge enhanced log-Mel images as input feature of CNN significantly improves the performance of audio scene classification.

READ FULL TEXT

page 3

page 4

research
07/25/2020

DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification

This paper presents a Depthwise Disout Convolutional Neural Network (DD-...
research
08/11/2021

Robust Feature Learning on Long-Duration Sounds for Acoustic Scene Classification

Acoustic scene classification (ASC) aims to identify the type of scene (...
research
03/31/2022

1-D CNN based Acoustic Scene Classification via Reducing Layer-wise Dimensionality

This paper presents an alternate representation framework to commonly us...
research
11/29/2016

Learning Filter Banks Using Deep Learning For Acoustic Signals

Designing appropriate features for acoustic event recognition tasks is a...
research
11/30/2020

A proposal and evaluation of new timbre visualisation methods for audio sample browsers

Searching through vast libraries of sound samples can be a daunting and ...
research
10/14/2019

Acoustic Scene Classification Based on a Large-margin Factorized CNN

In this paper, we present an acoustic scene classification framework bas...
research
12/31/2022

Attentional Graph Convolutional Network for Structure-aware Audio-Visual Scene Classification

Audio-Visual scene understanding is a challenging problem due to the uns...

Please sign up or login with your details

Forgot password? Click here to reset