Singer Identification Using Convolutional Acoustic Motif Embeddings

08/01/2020
by Aitor Arronte-Alvarez, et al.

Flamenco singing is characterized by pitch instability, micro-tonal ornamentations, large vibrato ranges, and a high degree of melodic variability. These musical features make the automatic identification of flamenco singers a difficult computational task. In this article we present an end-to-end pipeline for flamenco singer identification based on acoustic motif embeddings. Our approach first approximates the fundamental frequency obtained directly from the raw audio signal. This approximation reduces the high variability of the audio signal and allows small melodic patterns to be discovered with a sequential pattern mining technique, which yields a dictionary of motifs. Several acoustic features are then used to extract fixed-length embeddings of variable-length motifs with convolutional architectures. We test the quality of the embeddings in a flamenco singer identification task, compare our approach with previous deep learning architectures, and study the effect of motivic patterns and acoustic features on identification. Results indicate that motivic patterns play a crucial role in identifying flamenco singers by minimizing the size of the signal to be learned and discarding information that is not relevant to the task. The deep learning architecture presented outperforms denser models used in large-scale audio classification problems.
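
As a rough illustration of the first two stages described in the abstract (approximating the fundamental frequency and collecting a dictionary of small melodic patterns), the sketch below rounds an f0 contour to the nearest semitone and counts short interval patterns that recur across recordings. It is a minimal sketch under stated assumptions, not the authors' method: the function names `quantize_f0` and `mine_motifs`, the semitone rounding, the convention that unvoiced frames are marked with 0, and the use of contiguous interval n-grams in place of a full sequential pattern mining algorithm are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def quantize_f0(f0_hz, ref_hz=440.0):
    """Approximate an f0 contour (Hz) as a sequence of MIDI note numbers.

    Assumption: unvoiced frames are marked with values <= 0 and are skipped.
    Consecutive frames that round to the same semitone are merged.
    """
    notes = []
    for f in f0_hz:
        if f <= 0:
            continue
        midi = int(round(69 + 12 * math.log2(f / ref_hz)))
        if not notes or notes[-1] != midi:
            notes.append(midi)
    return notes


def mine_motifs(sequences, length=3, min_support=2):
    """Build a motif dictionary from note sequences.

    A motif here is a contiguous run of `length` pitch intervals (in
    semitones). Its support is the number of sequences that contain it at
    least once. Only motifs with support >= min_support are kept.
    """
    support = Counter()
    for seq in sequences:
        intervals = [b - a for a, b in zip(seq, seq[1:])]
        seen = {
            tuple(intervals[i:i + length])
            for i in range(len(intervals) - length + 1)
        }
        support.update(seen)
    return {motif: count for motif, count in support.items() if count >= min_support}


if __name__ == "__main__":
    contour = [440.0, 442.0, 0.0, 466.16, 493.88]
    print(quantize_f0(contour))
    print(mine_motifs([[60, 62, 64, 65], [67, 69, 71, 72]], length=2))
```

Using intervals rather than absolute pitches makes the patterns transposition-invariant, which is one plausible way to let the same motif match across singers in different registers. In the pipeline the abstract describes, the audio segments corresponding to such motifs would then be passed to a convolutional model to obtain fixed-length embeddings.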

research
11/17/2022

SpectNet : End-to-End Audio Signal Classification Using Learnable Spectrograms

Pattern recognition from audio signals is an active research topic encom...
research
04/18/2019

End-to-End Environmental Sound Classification using a 1D Convolutional Neural Network

In this paper, we present an end-to-end approach for environmental sound...
research
09/28/2022

MeWEHV: Mel and Wave Embeddings for Human Voice Tasks

A recent trend in speech processing is the use of embeddings created thr...
research
03/12/2018

Convolutional Neural Networks and Language Embeddings for End-to-End Dialect Recognition

Dialect identification (DID) is a special case of general language ident...
research
03/18/2023

Content Adaptive Front End For Audio Signal Processing

We propose a learnable content adaptive front end for audio signal proce...
research
07/22/2020

To Be or Not To Be a Verbal Multiword Expression: A Quest for Discriminating Features

Automatic identification of multiword expressions (MWEs) is a pre-requisi...
research
07/12/2016

City-Identification of Flickr Videos Using Semantic Acoustic Features

City-identification of videos aims to determine the likelihood of a vide...
