Audio Concept Classification with Hierarchical Deep Neural Networks

10/11/2017
by Mirco Ravanelli, et al.

Audio-based multimedia retrieval tasks may identify semantic information in audio streams, i.e., audio concepts (such as music, laughter, or a revving engine). Conventional Gaussian Mixture Models (GMMs) have had some success in classifying a reduced set of audio concepts. However, multi-class classification can benefit from context window analysis and the discriminating power of deeper architectures. Although deep learning has shown promise in various applications such as speech and object recognition, it has not yet met expectations in other fields such as audio concept classification. This paper explores, for the first time, the potential of deep learning in classifying audio concepts on User-Generated Content videos. The proposed system comprises two cascaded neural networks in a hierarchical configuration to analyze the short- and long-term context information. Our system outperforms a GMM approach by a relative 54% and a Deep Neural Network by 12%.
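The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, context widths, the stride of the second stage, and the helper names (`stack_context`, `mlp_forward`, `init_mlp`, `hierarchical_forward`) are all assumptions chosen for illustration, and the weights are random rather than trained. The sketch only shows the data flow: a first network classifies each frame from a short window of acoustic features, and a second network re-classifies each frame from a wider window of the first network's posteriors.

```python
import numpy as np

def stack_context(frames, left, right, stride=1):
    """Concatenate each frame with its neighbours; edges are padded by repetition."""
    n = frames.shape[0]
    offsets = np.arange(-left, right + 1) * stride
    idx = np.clip(np.arange(n)[:, None] + offsets[None, :], 0, n - 1)
    return frames[idx].reshape(n, -1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_mlp(rng, n_in, n_hidden, n_out):
    """Random (untrained) weights for a one-hidden-layer network."""
    return (rng.normal(0.0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.1, (n_hidden, n_out)), np.zeros(n_out))

def mlp_forward(x, params):
    """One ReLU hidden layer followed by a softmax over audio concepts."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(0.0, x @ w1 + b1)
    return softmax(hidden @ w2 + b2)

def hierarchical_forward(features, net1, net2,
                         ctx1=(2, 2), ctx2=(5, 5), stride2=3):
    """Two cascaded networks; all context settings are hypothetical."""
    # Stage 1: short-term context over the acoustic features.
    post1 = mlp_forward(stack_context(features, *ctx1), net1)
    # Stage 2: long-term context over the stage-1 posteriors,
    # sampled every `stride2` frames to cover a wider time span.
    post2 = mlp_forward(stack_context(post1, *ctx2, stride=stride2), net2)
    return post1, post2

# Toy usage with synthetic data (50 frames, 13 features, 4 concepts).
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 13))
net1 = init_mlp(rng, n_in=13 * 5, n_hidden=32, n_out=4)   # 5 = 2 + 1 + 2 frames
net2 = init_mlp(rng, n_in=4 * 11, n_hidden=32, n_out=4)   # 11 = 5 + 1 + 5 frames
post1, post2 = hierarchical_forward(features, net1, net2)
```

Both `post1` and `post2` hold one posterior distribution per frame. In a trained system of this kind, the second stage would typically be fitted after the first, using the first stage's outputs as its inputs.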


