A Case Study of Deep-Learned Activations via Hand-Crafted Audio Features

07/03/2019
by Olga Slizovskaia, et al.

Explaining Convolutional Neural Networks (CNNs) is particularly challenging in all areas of application, and it is notably under-researched in the music and audio domain. In this paper, we approach explainability by exploiting our knowledge of hand-crafted audio features. Our study focuses on a well-defined MIR task: the recognition of musical instruments in user-generated music recordings. We compute the similarity between a set of traditional audio features and the representations learned by CNNs. We also propose a technique for measuring the similarity between activation maps and audio features that are typically presented in the form of a matrix, such as chromagrams or spectrograms. We observe that some neurons' activations correspond to well-known classical audio features. In particular, for shallow layers, we find similarities between activations and the harmonic and percussive components of the spectrum. For deeper layers, we compare chromagrams with high-level activation maps, and loudness and onset rate with deep-learned embeddings.
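The abstract does not say which similarity measure the authors use, so the following is only a minimal sketch of one plausible way to compare an activation map with a matrix-shaped audio feature. It assumes Pearson correlation as the measure and nearest-neighbour resampling to bring the feature to the activation map's shape; both choices, and the names `resize_nearest` and `map_similarity`, are hypothetical and not taken from the paper. Random data stands in for a real chromagram.

```python
import numpy as np

def resize_nearest(matrix, shape):
    """Resample a 2D array to `shape` by nearest-neighbour index lookup."""
    rows = np.arange(shape[0]) * matrix.shape[0] // shape[0]
    cols = np.arange(shape[1]) * matrix.shape[1] // shape[1]
    return matrix[np.ix_(rows, cols)]

def map_similarity(activation, feature):
    """Pearson correlation between an activation map and a feature matrix.

    Assumption: the feature is first resampled to the activation's shape.
    Returns 0.0 when either input is constant (correlation undefined).
    """
    resized = resize_nearest(feature, activation.shape)
    a = activation.ravel() - activation.mean()
    f = resized.ravel() - resized.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(f)
    return float(a @ f / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
# Stand-in for a chromagram: 12 pitch classes x 200 time frames.
chroma = rng.random((12, 200))
# Toy "activation map": a downsampled, affinely rescaled copy of the feature.
activation = resize_nearest(chroma, (6, 50)) * 2.0 + 1.0
# Unrelated noise of the same shape, for contrast.
noise = rng.random((6, 50))

print(map_similarity(activation, chroma))
print(map_similarity(noise, chroma))
```

Because Pearson correlation is invariant to scaling and offset, the toy activation scores close to 1 against the feature it was derived from, while unrelated noise scores much lower. A real analysis would replace the random arrays with activations extracted from a trained network and features computed from the same audio excerpt.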

