Partial AUC optimization based deep speaker embeddings with class-center learning for text-independent speaker verification

11/19/2019
by Zhongxin Bai, et al.

Deep embedding based text-independent speaker verification has outperformed traditional methods in many challenging scenarios. Its loss functions fall into two broad classes: verification and identification. Verification loss functions match the speaker verification pipeline, but they are difficult to implement. As a result, most state-of-the-art deep embedding methods use identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function that maximizes the partial area under the receiver operating characteristic (ROC) curve (pAUC) for deep embedding based text-independent speaker verification. We also propose a class-center based method for constructing training trials, which improves training efficiency and is critical for the proposed loss function to be comparable to the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with state-of-the-art identification loss functions.
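The abstract does not spell out the loss or the trial construction, so the following is only a rough NumPy sketch of how the two ideas could fit together, not the authors' formulation. It assumes cosine scoring, class centers computed as per-speaker mean embeddings, a squared-hinge surrogate for pAUC, and a false-positive-rate window `[alpha, beta]`; the function names, the `margin` value, and the default window are all hypothetical choices made for illustration.

```python
import numpy as np

def class_center_trials(embeddings, labels):
    """Score every embedding against every class center (assumed: mean embedding).

    Returns cosine scores of target trials (embedding vs. its own center)
    and non-target trials (embedding vs. any other center).
    """
    classes = np.unique(labels)
    centers = np.stack([embeddings[labels == k].mean(axis=0) for k in classes])
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    scores = e @ c.T                                  # (num_samples, num_classes)
    is_target = labels[:, None] == classes[None, :]   # boolean trial labels
    return scores[is_target], scores[~is_target]

def pauc_hinge_loss(pos, neg, alpha=0.0, beta=0.1, margin=0.5):
    """Squared-hinge surrogate for pAUC over the FPR window [alpha, beta].

    Only the highest-scoring non-target trials whose rank falls inside the
    window are kept; these are the ones that determine that part of the ROC.
    """
    neg_sorted = np.sort(neg)[::-1]                   # descending: hardest first
    lo = min(int(np.ceil(alpha * neg.size)), neg.size - 1)
    hi = min(max(int(np.floor(beta * neg.size)), lo + 1), neg.size)
    hard_neg = neg_sorted[lo:hi]
    diff = pos[:, None] - hard_neg[None, :]           # all (target, non-target) pairs
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))

# Toy usage with two well-separated hypothetical speakers.
emb = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
lab = np.array([0, 0, 1, 1])
pos, neg = class_center_trials(emb, lab)
loss = pauc_hinge_loss(pos, neg, beta=0.5)
```

Scoring against class centers yields one trial per (sample, speaker) pair rather than one per pair of samples, which is one plausible reading of why the abstract describes this construction as improving training efficiency. In an actual training setup the same computation would be written in an automatic-differentiation framework so that gradients flow back into the embedding network.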

research
07/22/2018

Unified Hypersphere Embedding for Speaker Recognition

Incremental improvements in accuracy of Convolutional Neural Networks ar...
research
02/07/2019

End-to-end losses based on speaker basis vectors and all-speaker hard negative mining for speaker verification

In recent years, speaker verification has been primarily performed using...
research
03/05/2021

Harnessing Geometric Constraints from Emotion Labels to improve Face Verification

For the task of face verification, we explore the utility of harnessing ...
research
08/25/2020

Few Shot Text-Independent speaker verification using 3D-CNN

Facial recognition system is one of the major successes of Artificial in...
research
10/31/2022

Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit

Speaker modeling is essential for many related tasks, such as speaker re...
research
03/27/2022

Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic Choices

The area under the ROC curve (AUROC) has been vigorously applied for imb...
research
06/19/2019

Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification

In this paper, we propose a new pooling method called spatial pyramid en...
