Acoustic scene classification using multi-layer temporal pooling based on convolutional neural network

02/26/2019
by Liwen Zhang, et al.

Temporal dynamics and discriminative information in audio signals are crucial for acoustic scene classification (ASC). In this work, we propose a temporal feature learning method with a hierarchical architecture, called Multi-Layer Temporal Pooling (MLTP). Through recursive non-linear feature mappings and temporal pooling operations, MLTP captures the high-level temporal dynamics of an entire audio signal of arbitrary duration in an unsupervised way. Taking as input the patch-level discriminative features extracted by a simple pre-trained convolutional neural network (CNN), our method learns temporal features for the entire audio sample, which are then used directly to train the classifier. Experimental results show that our method significantly improves ASC performance. Without any data augmentation or ensemble strategies, it achieves state-of-the-art performance with only one lightweight CNN and a single classifier.
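The recursive structure described above (a non-linear mapping followed by temporal pooling, repeated layer by layer until one vector represents the whole clip) can be illustrated with a minimal sketch. The abstract does not state which mapping or pooling operator MLTP uses, so everything concrete here is an assumption for illustration only: a signed square root stands in for the non-linear mapping, sliding-window mean pooling stands in for the temporal pooling, and the names `signed_sqrt`, `pool_layer`, `multi_layer_temporal_pooling`, `window`, `stride`, and `num_layers` are hypothetical. The input is assumed to be a list of patch-level feature vectors, such as those a pre-trained CNN might produce.

```python
import math


def signed_sqrt(vec):
    """Assumed non-linear mapping: sign(x) * sqrt(|x|) per element."""
    return [math.copysign(math.sqrt(abs(x)), x) for x in vec]


def pool_layer(frames, window, stride):
    """Map every frame non-linearly, then mean-pool over sliding windows.

    Mean pooling is a stand-in; the abstract does not name the operator.
    """
    if not frames:
        raise ValueError("need at least one feature vector")
    mapped = [signed_sqrt(f) for f in frames]
    if len(mapped) <= window:
        starts = [0]  # sequence shorter than the window: pool all of it
    else:
        starts = range(0, len(mapped) - window + 1, stride)
    dim = len(mapped[0])
    pooled = []
    for s in starts:
        chunk = mapped[s:s + window]
        pooled.append([sum(f[d] for f in chunk) / len(chunk) for d in range(dim)])
    return pooled


def multi_layer_temporal_pooling(frames, num_layers=2, window=4, stride=2):
    """Stack pooling layers, then collapse what remains into one vector."""
    seq = frames
    for _ in range(num_layers):
        seq = pool_layer(seq, window, stride)
    # Final layer pools the whole remaining sequence, so the output size
    # does not depend on the duration of the input.
    return pool_layer(seq, window=len(seq), stride=1)[0]


# Two clips of different duration yield vectors of the same dimension.
short_clip = [[0.1 * t, 1.0, -0.5] for t in range(10)]
long_clip = [[0.1 * t, 1.0, -0.5] for t in range(37)]
short_vec = multi_layer_temporal_pooling(short_clip)
long_vec = multi_layer_temporal_pooling(long_clip)
```

Because the last layer pools over everything that is left, clips of any length map to a fixed-size vector, which is what lets a single classifier be trained on the result. No labels are used at any point, in line with the unsupervised setting the abstract describes.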

research
04/06/2019

Spatio-Temporal Attention Pooling for Audio Scene Classification

Acoustic scenes are rich and redundant in their content. In this work, w...
research
07/25/2020

DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification

This paper presents a Depthwise Disout Convolutional Neural Network (DD-...
research
06/12/2018

Sample Dropout for Audio Scene Classification Using Multi-Scale Dense Connected Convolutional Neural Network

Acoustic scene classification is an intricate problem for a machine. As ...
research
10/03/2022

Simple Pooling Front-ends For Efficient Audio Classification

Recently, there has been increasing interest in building efficient audio...
research
10/27/2022

A knowledge-driven vowel-based approach of depression classification from speech using data augmentation

We propose a novel explainable machine learning (ML) model that identifi...
research
06/09/2023

Acoustic Scene Clustering Using Joint Optimization of Deep Embedding Learning and Clustering Iteration

Recent efforts have been made on acoustic scene classification in the au...
research
12/11/2017

Unsupervised Feature Learning for Audio Analysis

Identifying acoustic events from a continuously streaming audio source i...
