JukeBox: A Multilingual Singer Recognition Dataset

08/08/2020
by Anurag Chowdhury, et al.

A text-independent speaker recognition system relies on successfully encoding speech factors such as vocal pitch, intensity, and timbre to achieve good performance. Most such systems are trained and evaluated using spoken voice or everyday conversational voice data. Spoken voice, however, exhibits a limited range of speaker dynamics, which constrains the utility of the derived speaker recognition models. Singing voice, on the other hand, covers a broader range of vocal and ambient factors and can therefore be used to evaluate the robustness of a speaker recognition system. Yet most existing speaker recognition datasets focus only on the spoken voice, and there is a significant shortage of labeled singing voice data suitable for speaker recognition research. To address this issue, we assemble JukeBox, a speaker recognition dataset of multilingual singing voice audio annotated with singer identity, gender, and language labels. We use current state-of-the-art methods to demonstrate the difficulty of performing speaker recognition on singing voice using models trained on spoken voice alone. We also evaluate the effect of gender and language on speaker recognition performance in both spoken and singing voice data. The complete JukeBox dataset can be accessed at http://iprobe.cse.msu.edu/datasets/jukebox.html.


