Deep Neural Network Embedding Learning with High-Order Statistics for Text-Independent Speaker Verification

03/28/2019
by Lanhua You, et al.

X-vector based deep neural network (DNN) embedding systems have proven effective for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN, with the primary task of classifying the target speakers and the auxiliary task of reconstructing the higher-order statistics of the original input utterance. The proposed training strategy combines supervised and unsupervised learning in one framework to make the speaker embeddings more discriminative and robust. Experiments are carried out on the NIST SRE16 evaluation dataset and the VOiCES dataset. The results demonstrate that the proposed method outperforms the original x-vector approach while adding very little complexity.
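The multi-task objective described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are reconstructed or how the two tasks are weighted, so everything specific here is an assumption made for illustration: the choice of per-dimension mean, standard deviation, skewness and kurtosis, the mean-squared-error reconstruction loss, the `aux_weight` value, and all function names. It is not the authors' implementation.

```python
import math
from typing import List, Sequence


def high_order_stats(frames: Sequence[Sequence[float]]) -> List[float]:
    """Per-dimension mean, std, skewness and kurtosis over time, concatenated.

    Assumption: these four statistics stand in for the "higher-order
    statistics" named in the abstract.
    """
    n = len(frames)
    dims = len(frames[0])
    stats: List[float] = []
    for d in range(dims):
        col = [frame[d] for frame in frames]
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        # Avoid division by zero when a feature is constant over the utterance.
        safe = std if std > 1e-8 else 1.0
        skew = sum(((x - mean) / safe) ** 3 for x in col) / n
        kurt = sum(((x - mean) / safe) ** 4 for x in col) / n
        stats.extend([mean, std, skew, kurt])
    return stats


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for one example (primary speaker-ID task)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def mse(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Mean squared error (hypothetical choice of reconstruction loss)."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


def multitask_loss(
    logits: Sequence[float],
    speaker: int,
    reconstruction: Sequence[float],
    frames: Sequence[Sequence[float]],
    aux_weight: float = 0.1,  # hypothetical weighting, not from the paper
) -> float:
    """Supervised classification loss plus weighted unsupervised auxiliary loss."""
    target_stats = high_order_stats(frames)
    return cross_entropy(logits, speaker) + aux_weight * mse(reconstruction, target_stats)
```

In this sketch the auxiliary target is computed from the input features alone, so it needs no speaker labels; that is the sense in which the abstract calls the second task unsupervised.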

research
03/28/2019

Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification

The x-vector based deep neural network (DNN) embedding systems have demo...
research
07/28/2020

Siamese x-vector reconstruction for domain adapted speaker recognition

With the rise of voice-activated applications, the need for speaker reco...
research
01/14/2020

Gaussian speaker embedding learning for text-independent speaker verification

The x-vector maps segments of arbitrary duration to vectors of fixed dim...
research
03/28/2019

Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification

In this paper, gating mechanisms are applied in deep neural network (DNN...
research
03/20/2020

Improving Embedding Extraction for Speaker Verification with Ladder Network

Speaker verification is an established yet challenging task in speech pr...
research
10/21/2020

Multi-task Metric Learning for Text-independent Speaker Verification

In this work, we introduce metric learning (ML) to enhance the deep embe...
research
03/31/2016

System Combination for Short Utterance Speaker Recognition

For text-independent short-utterance speaker recognition (SUSR), the per...
