Combining High-Level Features of Raw Audio Waves and Mel-Spectrograms for Audio Tagging

11/26/2018
by Marcel Lederle, et al.

In this paper, we describe our contribution to Task 2 of the DCASE 2018 Audio Challenge. Although ensembles of machine learning methods are now ubiquitous in classification tasks because they improve predictive performance, most ensemble methods combine predictions rather than learned features. We propose a single-model method that combines learned high-level features computed from log-scaled mel-spectrograms and from raw audio data. These features are learned separately by two Convolutional Neural Networks, one for each input type, and then combined by densely connected layers within a single network. This relatively simple approach, together with data augmentation, ranks among the best two percent in the Freesound General-Purpose Audio Tagging Challenge on Kaggle.
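
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two CNN branches are replaced by random stand-in feature vectors, and every name (`fuse_and_classify`, `init_layer`), the layer sizes, the ReLU activation, and the softmax output are assumptions chosen only for illustration. It shows the general idea of concatenating high-level features from the two branches and passing them through densely connected layers inside one model, rather than averaging the predictions of two separate models.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Hypothetical initializer: small random weights, zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(raw_features, mel_features, hidden, output):
    """Concatenate both feature vectors, then apply two dense layers."""
    fused = list(raw_features) + list(mel_features)  # feature-level fusion
    h = relu(dense(fused, *hidden))
    return softmax(dense(h, *output))


rng = random.Random(0)
# Stand-ins for the high-level features each CNN branch would produce
# (assumed size 8 each; the real feature sizes are not given here).
raw_features = [rng.random() for _ in range(8)]  # raw-waveform branch
mel_features = [rng.random() for _ in range(8)]  # log-mel-spectrogram branch

hidden = init_layer(16, 6, rng)  # 16 = 8 + 8 fused inputs (assumed sizes)
output = init_layer(6, 3, rng)   # 3 illustrative classes, not the challenge's label set

probs = fuse_and_classify(raw_features, mel_features, hidden, output)
print(probs)
```

In a trained model, the weights of the dense layers and of both branches would be learned jointly, which is what distinguishes this feature-level fusion from a prediction-level ensemble.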


research
02/21/2017

Mimicking Ensemble Learning with Deep Branched Networks

This paper proposes a branched residual network for image classification...
research
10/30/2018

General audio tagging with ensembling convolutional neural network and statistical features

Audio tagging aims to infer descriptive labels from audio clips. Audio t...
research
04/07/2017

Jet Constituents for Deep Neural Network Based Top Quark Tagging

Recent literature on deep neural networks for tagging of highly energeti...
research
07/08/2018

Densely Connected CNNs for Bird Audio Detection

Detecting bird sounds in audio recordings automatically, if accurate eno...
research
07/26/2018

General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

This paper describes Task 2 of the DCASE 2018 Challenge, titled "General...
research
03/03/2023

Low-Complexity Audio Embedding Extractors

Solving tasks such as speaker recognition, music classification, or sema...
research
04/11/2023

Audio Bank: A High-Level Acoustic Signal Representation for Audio Event Recognition

Automatic audio event recognition plays a pivotal role in making human r...
