Deep Speaker Vector Normalization with Maximum Gaussianality Training

10/30/2020
by Yunqi Cai, et al.

Deep speaker embedding is the state-of-the-art technique for speaker recognition. A key problem with this approach is that the resulting deep speaker vectors tend to be irregularly distributed. In previous research, we proposed a deep normalization approach based on a new discriminative normalization flow (DNF) model, by which the distributions of individual speakers are arguably transformed to homogeneous Gaussians. This normalization was demonstrated to be effective; despite this success, however, we found empirically that the latent codes produced by the DNF model are generally neither homogeneous nor Gaussian, although the model assumes that they are. In this paper, we argue that this problem is largely attributable to the maximum-likelihood (ML) training criterion of the DNF model, which aims to maximize the likelihood of the observations but does not necessarily improve the Gaussianality of the latent codes. We therefore propose a new Maximum Gaussianality (MG) training approach that directly maximizes the Gaussianality of the latent codes. Our experiments on two data sets, SITW and CNCeleb, demonstrate that the new MG training approach delivers much better performance than the previous ML training and exhibits improved domain generalizability, particularly with regard to cosine scoring.
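The abstract does not spell out how Gaussianality is measured, so the following is only a rough illustration of the idea, not the MG criterion itself. It assumes that one simple diagnostic for "how Gaussian" a single latent dimension looks is its sample skewness and excess kurtosis, both of which are zero for an exact Gaussian. The function name `skewness_and_excess_kurtosis` and the synthetic samples are hypothetical and chosen purely for demonstration.

```python
import math
import random


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) of a 1-D sample.

    Both statistics are 0 for an exact Gaussian, so large magnitudes
    indicate a departure from Gaussianality. This is an illustrative
    diagnostic, not the training criterion proposed in the paper.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n  # biased (population) variance
    std = math.sqrt(var)
    skew = sum(c ** 3 for c in centered) / (n * std ** 3)
    kurt = sum(c ** 4 for c in centered) / (n * var ** 2) - 3.0
    return skew, kurt


# Synthetic stand-ins for one dimension of a set of latent codes.
rng = random.Random(0)
gaussian_codes = [rng.gauss(0.0, 1.0) for _ in range(20000)]
skewed_codes = [rng.expovariate(1.0) for _ in range(20000)]  # clearly non-Gaussian

gauss_skew, gauss_kurt = skewness_and_excess_kurtosis(gaussian_codes)
skewed_skew, skewed_kurt = skewness_and_excess_kurtosis(skewed_codes)

print("Gaussian sample:    skew=%.3f, excess kurtosis=%.3f" % (gauss_skew, gauss_kurt))
print("Exponential sample: skew=%.3f, excess kurtosis=%.3f" % (skewed_skew, skewed_kurt))
```

Applied per dimension to the latent codes of a normalization model, statistics like these give a quick check on whether the Gaussian assumption holds in practice, which is the kind of mismatch the abstract reports for ML-trained DNF.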

