CN-Celeb: multi-genre speaker recognition

12/23/2020
by Lantian Li, et al.

Research on speaker recognition is extending to address vulnerability under in-the-wild conditions, among which genre mismatch is perhaps the most challenging: for instance, enrolling with read speech while testing with conversational or singing audio. This mismatch leads to complex and composite inter-session variations, both intrinsic (e.g., speaking style, physiological status) and extrinsic (e.g., recording device, background noise). Unfortunately, the few existing multi-genre corpora are not only limited in size but are also recorded under controlled conditions, which cannot support conclusive research on the multi-genre problem. In this work, we first publish CN-Celeb, a large-scale multi-genre corpus that includes in-the-wild speech utterances of 3,000 speakers in 11 different genres. Second, using this dataset, we conduct a comprehensive study of the multi-genre phenomenon, in particular the impact of the multi-genre challenge on speaker recognition and how to utilize the valuable multi-genre data more efficiently.

08/16/2021

NIST SRE CTS Superset: A large-scale dataset for telephony speaker recognition

This document provides a brief description of the National Institute of ...
04/01/2022

Speaker verification in mismatch training and testing conditions

This paper presents an exhaustive study about the robustness of several ...
12/23/2020

A Principle Solution for Enroll-Test Mismatch in Speaker Recognition

Mismatch between enrollment and test conditions causes serious performan...
04/15/2019

Synthesising 3D Facial Motion from "In-the-Wild" Speech

Synthesising 3D facial motion from speech is a crucial problem manifesti...
10/31/2019

CN-CELEB: a challenging Chinese speaker recognition dataset

Recently, researchers set an ambitious goal of conducting speaker recogn...
02/23/2022

Speaker recognition improvement using blind inversion of distortions

In this paper we propose the inversion of nonlinear distortions in order...
12/02/2019

Speaker detection in the wild: Lessons learned from JSALT 2019

This paper presents the problems and solutions addressed at the JSALT wo...