Deep Convolutional and Recurrent Networks for Polyphonic Instrument Classification from Monophonic Raw Audio Waveforms

02/13/2021
by Kleanthis Avramidis, et al.

Sound Event Detection and Audio Classification tasks are traditionally addressed through time-frequency representations of audio signals, such as spectrograms. However, the emergence of deep neural networks as efficient feature extractors has enabled the direct use of audio signals for classification. In this paper, we attempt to recognize musical instruments in polyphonic audio by feeding only their raw waveforms into deep learning models. Various recurrent and convolutional architectures incorporating residual connections are examined and parameterized in order to build end-to-end classifiers with low computational cost and only minimal preprocessing. Using a parallel CNN-BiGRU model with multiple residual connections, we obtain competitive classification scores and useful instrument-wise insight on the IRMAS test set, while maintaining a significantly reduced number of trainable parameters.
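
The idea of a residual connection applied directly to raw audio samples can be illustrated with a small pure-Python sketch. This is not the authors' CNN-BiGRU model: the function names, the toy waveform, and the fixed smoothing kernel are all illustrative assumptions, and a real network would learn its kernels and stack many such blocks. The sketch only shows the basic pattern, in which the block's input is added back onto the output of a 1-D convolution.

```python
from typing import List


def conv1d_same(signal: List[float], kernel: List[float]) -> List[float]:
    """1-D cross-correlation, zero-padded so the output length equals the input length."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(signal))
    ]


def relu(values: List[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(signal: List[float], kernel: List[float]) -> List[float]:
    """Compute y = x + relu(conv(x)); the skip path carries the input through unchanged."""
    transformed = relu(conv1d_same(signal, kernel))
    return [x + t for x, t in zip(signal, transformed)]


# Hypothetical toy waveform and kernel, chosen only for illustration.
waveform = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
kernel = [0.25, 0.5, 0.25]
output = residual_block(waveform, kernel)
```

Because the skip path passes the input through untouched, a block whose convolution contributes nothing (for example, an all-zero kernel) reduces to the identity mapping, which is the property that makes deep stacks of such blocks easier to train.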


research
06/26/2019

On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification

Residual learning is a recently proposed learning framework to facilitat...
research
05/21/2016

Deep convolutional networks on the pitch spiral for musical instrument recognition

Musical performance combines a wide range of pitches, nuances, and expre...
research
07/24/2021

Use of speaker recognition approaches for learning timbre representations of musical instrument sounds from raw waveforms

Timbre representations of musical instruments, essential for diverse app...
research
04/25/2022

End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network

While efficient architectures and a plethora of augmentations for end-to...
research
06/16/2019

Audio Transport: A Generalized Portamento via Optimal Transport

This paper proposes a new method to interpolate between two audio signal...
research
03/30/2018

Conditional End-to-End Audio Transforms

We present an end-to-end method for transforming audio from one style to...
research
10/06/2021

An Investigation of the Effectiveness of Phase for Audio Classification

While log-amplitude mel-spectrogram has widely been used as the feature ...
