SpectNet: End-to-End Audio Signal Classification Using Learnable Spectrograms

11/17/2022
by Md. Istiaq Ansari, et al.

Pattern recognition from audio signals is an active research topic encompassing audio tagging, acoustic scene classification, music classification, and other areas. Spectrograms and mel-frequency cepstral coefficients (MFCCs) are among the most commonly used features for audio signal analysis and classification. Recently, deep convolutional neural networks (CNNs) have been applied successfully to audio classification problems using spectrogram-based 2D features. In this paper, we present SpectNet, an integrated front-end layer that extracts spectrogram features within a CNN architecture and can be used for audio pattern recognition tasks. The front-end layer uses learnable gammatone filters that are initialized using mel-scale filters. It outputs a 2D spectrogram image that can be fed into a 2D CNN for classification. The parameters of the entire network, including the front-end filterbank, can be updated via back-propagation, which allows the spectrogram-image features to be fine-tuned to the target audio dataset. The proposed method is evaluated on two audio signal classification tasks: heart sound anomaly detection and acoustic scene classification. Compared to classical spectrogram image features, it shows a significant 1.02% improvement in MACC for the heart sound classification task and a 2.11% improvement in accuracy for the acoustic scene classification task. The source code of our experiments can be found at <https://github.com/mHealthBuet/SpectNet>.
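
The front-end idea described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that): it builds a fixed gammatone filterbank whose center frequencies are spaced on the mel scale, filters a waveform with each kernel, and stacks framed log energies into a 2D image. All function names, the ERB-based bandwidth initialization, the filter order, and the frame sizes are assumptions chosen for illustration. In SpectNet the filter parameters would additionally be updated by back-propagation inside a deep learning framework, which this standard-library sketch does not attempt.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) equally spaced on the mel scale, edges excluded."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_filters + 1)
    return [mel_to_hz(m_lo + step * (i + 1)) for i in range(n_filters)]

def erb_bandwidth(fc):
    """Assumed initialization: 1.019 * ERB(fc), with the Glasberg-Moore ERB."""
    return 1.019 * 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_kernel(fc, bandwidth, sample_rate, length, order=4):
    """Unit-energy gammatone impulse response:
    g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)."""
    taps = []
    for k in range(length):
        t = k / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
        taps.append(envelope * math.cos(2.0 * math.pi * fc * t))
    norm = math.sqrt(sum(x * x for x in taps)) or 1.0
    return [x / norm for x in taps]

def filter_signal(signal, kernel):
    """'Valid' 1D convolution of the signal with one kernel."""
    n = len(kernel)
    return [
        sum(signal[i + j] * kernel[n - 1 - j] for j in range(n))
        for i in range(len(signal) - n + 1)
    ]

def spectrogram_image(signal, kernels, frame_len, hop):
    """One row per filter, one column per frame; values are log mean energy."""
    image = []
    for kernel in kernels:
        band = filter_signal(signal, kernel)
        row = []
        for start in range(0, len(band) - frame_len + 1, hop):
            frame = band[start:start + frame_len]
            energy = sum(x * x for x in frame) / frame_len
            row.append(math.log(energy + 1e-10))
        image.append(row)
    return image

# Illustrative settings (hypothetical, not taken from the paper).
SAMPLE_RATE = 4000
centers = mel_center_frequencies(n_filters=8, f_min=50.0, f_max=1800.0)
kernels = [
    gammatone_kernel(fc, erb_bandwidth(fc), SAMPLE_RATE, length=128)
    for fc in centers
]

# A pure tone placed exactly on the fourth center frequency.
tone = [math.sin(2.0 * math.pi * centers[3] * n / SAMPLE_RATE) for n in range(1000)]
image = spectrogram_image(tone, kernels, frame_len=100, hop=50)

# Index of the band with the highest average log energy.
strongest_band = max(range(len(image)), key=lambda i: sum(image[i]) / len(image[i]))
```

In a trainable version, the center frequencies and bandwidths (or the kernel taps themselves) would be network parameters, and the resulting image would be passed to a 2D CNN so that the classification loss can adjust the filterbank.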
