Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning

03/18/2016
by Lars Hertel et al.

Recognizing acoustic events is a difficult problem for machines and an emerging field of research. Deep neural networks achieve convincing results and are currently the state-of-the-art approach for many tasks. One advantage is their implicit feature learning, as opposed to explicit feature extraction from the input signal. In this work, we analyzed whether more discriminative features can be learned from the time-domain or the frequency-domain representation of the audio signal. For this purpose, we trained multiple deep networks with different architectures on the Freiburg-106 and ESC-10 datasets. Our results show that feature learning from the frequency domain is superior to feature learning from the time domain. Moreover, adding convolution and pooling layers, to explore local structures of the audio signal, significantly improves recognition performance and achieves state-of-the-art results.
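To make the two input representations concrete, the following is a minimal sketch of how a network input could be prepared in either domain: raw overlapping frames for the time domain, and a windowed magnitude spectrum of the same frames for the frequency domain. The frame length, hop size, sample rate, Hann window, and the function names `frame_signal` and `magnitude_spectrum` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def frame_signal(signal, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames (time-domain input).

    frame_len and hop are assumed values chosen for illustration only.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def magnitude_spectrum(frames):
    """Windowed magnitude spectrum of each frame (frequency-domain input)."""
    window = np.hanning(frames.shape[1])  # assumed window choice
    return np.abs(np.fft.rfft(frames * window, axis=1))


# Synthetic one-second 440 Hz tone standing in for an audio recording.
sample_rate = 16000  # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)

time_frames = frame_signal(signal)             # shape: (n_frames, frame_len)
freq_frames = magnitude_spectrum(time_frames)  # shape: (n_frames, frame_len // 2 + 1)
```

Either array could then be fed frame by frame to a classifier. The frequency-domain variant roughly halves the input dimension per frame and discards phase, which is one plausible reason a network may find it easier to learn from.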

research
05/09/2018

End-to-End Polyphonic Sound Event Detection Using Convolutional Recurrent Neural Networks with Learned Time-Frequency Representation Input

Sound event detection systems typically consist of two stages: extractin...
research
06/28/2020

Frequency learning for image classification

Machine learning applied to computer vision and signal processing is ach...
research
05/15/2017

Mosquito Detection with Neural Networks: The Buzz of Deep Learning

Many real-world time-series analysis problems are characterised by scarc...
research
01/08/2022

A novel audio representation using space filling curves

Since convolutional neural networks (CNNs) have revolutionized the image...
research
12/11/2017

Unsupervised Feature Learning for Audio Analysis

Identifying acoustic events from a continuously streaming audio source i...
research
07/16/2019

Machine learning without a feature set for detecting bursts in the EEG of preterm infants

Deep neural networks enable learning directly on the data without the do...
research
02/14/2020

Acoustic Scene Classification Using Bilinear Pooling on Time-liked and Frequency-liked Convolution Neural Network

The current methodology in tackling Acoustic Scene Classification (ASC) ...