Deep convolutional networks on the pitch spiral for musical instrument recognition

05/21/2016
by Vincent Lostanlen, et al.

Musical performance combines a wide range of pitches, nuances, and expressive techniques. Audio-based classification of musical instruments thus requires signal representations that are invariant to such transformations. This article investigates the construction of learned convolutional architectures for instrument recognition, given a limited amount of annotated training data. In this context, we benchmark three weight sharing strategies for deep convolutional networks in the time-frequency domain: temporal kernels; time-frequency kernels; and a linear combination of time-frequency kernels that are one octave apart, akin to a Shepard pitch spiral. We provide an acoustical interpretation of these strategies within the source-filter framework of quasi-harmonic sounds with a fixed spectral envelope, which are archetypal of musical notes. The best classification accuracy is obtained by hybridizing all three convolutional layers into a single deep learning architecture.
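
The three weight sharing strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the input is a log-frequency time-frequency array (for example a constant-Q-like representation) of shape (n_bins, n_frames) with a fixed number of bins per octave, and the function names, kernel sizes, and the parameter `bins_per_octave` are hypothetical choices made for illustration only.

```python
import numpy as np


def correlate2d_valid(x, kernel):
    """'Valid' 2-D cross-correlation of x (bins, frames) with a small kernel."""
    k_f, k_t = kernel.shape
    out_f = x.shape[0] - k_f + 1
    out_t = x.shape[1] - k_t + 1
    out = np.zeros((out_f, out_t))
    # Accumulate one shifted, weighted copy of x per kernel coefficient.
    for i in range(k_f):
        for j in range(k_t):
            out += kernel[i, j] * x[i:i + out_f, j:j + out_t]
    return out


def temporal_layer(x, kernel_t):
    """Temporal kernel: one 1-D filter along time, shared by every frequency bin."""
    return correlate2d_valid(x, kernel_t[np.newaxis, :])


def time_frequency_layer(x, kernel_tf):
    """Time-frequency kernel: one 2-D filter shared across time and log-frequency."""
    return correlate2d_valid(x, kernel_tf)


def spiral_layer(x, kernels, bins_per_octave):
    """Sum of time-frequency kernels applied to copies of x one octave apart.

    kernels has shape (n_octaves, k_f, k_t); kernels[j] is applied to the
    input shifted up by j octaves (assumed layout, for illustration).
    """
    n_octaves, k_f, _ = kernels.shape
    span = (n_octaves - 1) * bins_per_octave
    out_f = x.shape[0] - span - k_f + 1
    if out_f < 1:
        raise ValueError("input has too few frequency bins for this kernel")
    out = 0.0
    for j in range(n_octaves):
        start = j * bins_per_octave
        shifted = x[start:start + out_f + k_f - 1]
        out = out + correlate2d_valid(shifted, kernels[j])
    return out


# Usage on a random 4-octave input with 12 bins per octave and 20 frames.
rng = np.random.default_rng(0)
x = rng.standard_normal((48, 20))
y_time = temporal_layer(x, rng.standard_normal(3))            # shape (48, 18)
y_tf = time_frequency_layer(x, rng.standard_normal((5, 3)))   # shape (44, 18)
y_spiral = spiral_layer(x, rng.standard_normal((2, 5, 3)), bins_per_octave=12)  # shape (32, 18)
```

In this sketch the spiral layer reduces to the plain time-frequency layer when only one octave is used, which makes explicit that the third strategy extends the second by tying together bins that share a pitch class.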

research
05/14/2017

Musical Instrument Recognition Using Their Distinctive Characteristics in Artificial Neural Networks

In this study an Artificial Neural Network was trained to classify music...
research
09/25/2008

Audio Classification from Time-Frequency Texture

Time-frequency representations of audio signals often resemble texture i...
research
02/13/2021

Deep Convolutional and Recurrent Networks for Polyphonic Instrument Classification from Monophonic Raw Audio Waveforms

Sound Event Detection and Audio Classification tasks are traditionally a...
research
01/18/2023

An investigation of the reconstruction capacity of stacked convolutional autoencoders for log-mel-spectrograms

In audio processing applications, the generation of expressive sounds ba...
research
10/01/2020

Helicality: An Isomap-based Measure of Octave Equivalence in Audio Data

Octave equivalence serves as domain-knowledge in MIR systems, including ...
research
11/22/2018

TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer

In this work, we address the problem of musical timbre transfer, where t...
research
03/26/2019

Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters

In this article we explore how the different semantics of spectrograms' ...
