1 Introduction
Automatic speaker verification (ASV) has found a broad range of applications. Conventional ASV methods are based on statistical models [1, 2, 3]. Perhaps the most famous statistical model in ASV is the Gaussian mixture model-universal background model (GMM-UBM) [1]. This model represents the ‘main’ variance of speech signals by a set of global Gaussian components (the UBM), and speaker characteristics are represented as the ‘shift’ of speaker-dependent GMMs over each Gaussian component of the UBM, denoted by a ‘speaker supervector’. The GMM-UBM architecture was later enhanced by subspace models, which assume that a speaker supervector can be factorized into a speaker vector (usually low-dimensional) and a residual that represents intra-speaker variation. Joint factor analysis [2, 4] was the most successful subspace model in the early days, though the subsequent i-vector model attracted more attention [3]. Besides its simple structure and superior performance, the i-vector approach first demonstrated that a speaker can be represented by a low-dimensional vector, which is the precursor of the important concept of speaker embedding.
It should be emphasized, however, that the i-vector model is purely unsupervised, and the embeddings (i-vectors) contain a multitude of variations beyond speaker information. Therefore, it relies heavily on a powerful back-end scoring model to achieve reasonable performance. Among various back-end models, PLDA [5, 6] has been very powerful, in particular with simple whitening and length normalization [7]. In a nutshell, PLDA assumes that the ‘true’ speaker code within an i-vector is low-dimensional and follows a simple Gaussian prior, while the residual is a full-rank Gaussian, formally written as:
x_ij = μ + V y_i + ε_ij    (1)
where x_ij is the i-vector of utterance j of speaker i, y_i and ε_ij are the speaker code and the residual respectively, μ is the global shift and V is the speaker loading matrix. Under this assumption, the speaker prior p(y), the conditional p(x|y) and the marginal p(x) are all Gaussian. Fortunately, i-vectors match these conditions pretty well, due to the linear Gaussian structure of the i-vector model. Partly for this reason, the i-vector/PLDA framework remains a strong baseline on many ASV tasks.
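For intuition, the PLDA generative model above is easy to simulate. The sketch below draws several utterances of one speaker under the model; all dimensions and parameter values are toy assumptions chosen for illustration, not values from any trained system.

```python
import numpy as np

# Toy PLDA generative model: x = mu + V y + eps, with y ~ N(0, I).
rng = np.random.default_rng(0)
D, d = 20, 5                        # embedding dim and speaker-code dim (assumed)
mu = rng.normal(size=D)             # global shift
V = rng.normal(size=(D, d))         # low-rank speaker loading matrix
Sigma = 0.1 * np.eye(D)             # full-rank residual covariance (diagonal here)

def sample_speaker_embeddings(n_utts):
    """Draw n_utts embeddings of a single (random) speaker."""
    y = rng.normal(size=d)                                  # speaker code ~ N(0, I)
    eps = rng.multivariate_normal(np.zeros(D), Sigma, size=n_utts)
    return mu + y @ V.T + eps                               # x = mu + V y + eps

X = sample_speaker_embeddings(1000)
# within one speaker, only eps varies, so the scatter is governed by Sigma
```

Note how the between-speaker variation comes entirely from the low-rank term V y, while the within-speaker scatter is the full-rank residual.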
Recently, neural-based ASV models have shown great potential [8, 9, 10, 11]. These models utilize the power of deep neural networks (DNNs) to learn strong speaker-dependent features, ideally from a large amount of speaker-labelled data. The present research can be categorized into frame-based learning [8, 10] and utterance-based learning [9, 11, 12, 13]. Frame-based learning intends to learn short-time speaker features, and is thus more generally useful for speaker-related tasks, while utterance-based learning focuses on whole-utterance speaker representation and/or classification, and is hence more suitable for the ASV task. A popular utterance-based learning approach is the x-vector model proposed by Snyder et al. [11], where the first- and second-order statistics of frame-level features are collected and projected to a low-dimensional representation called an x-vector, with the objective of discriminating between the speakers in the training dataset. The x-vector model has achieved good performance in various speaker recognition tasks, as well as related tasks such as language identification [14]. Essentially, the x-vector model can be regarded as a deep and discriminative counterpart of the i-vector model, and is often called deep speaker embedding.
Interestingly, experiments show that the x-vector system also relies heavily on a strong back-end scoring model, in particular PLDA. Since x-vectors are already sufficiently discriminative, the role of PLDA here is regularization rather than discrimination (as in the i-vector paradigm): it (globally) discovers the underlying speaker codes that are intrinsically Gaussian, so that ASV scores based on these codes tend to be comparable across speakers. A potential problem, however, is that x-vectors inferred from DNNs are unconstrained, which means that the speaker distribution and the speaker conditionals could take any form. These unconstrained distributions may cause great difficulty for PLDA in discovering the underlying speaker codes that are assumed to be Gaussian. Some researchers have noticed this problem and proposed remedies that encourage the speaker conditionals to be more Gaussian [15, 16], but none of them constrain the prior, so the produced x-vectors are still not well suited to PLDA modeling.
In this paper, we investigate an explicit regularization model for unconstrained x-vectors. This model is inspired by the variational autoencoder (VAE) architecture, which is capable of projecting an unconstrained distribution onto a simple Gaussian distribution. This can be used to constrain the marginal distribution of x-vectors. Moreover, a cohesive loss is added to the VAE objective. This follows the same spirit as [15, 16] and can constrain the speaker conditionals. Experiments showed that with this VAE-based regularization, the performance of cosine scoring is largely improved, and is even comparable with PLDA. This indicates that VAE plays a similar role to PLDA, or, in other words, that PLDA works as a regularizer rather than a discriminator in x-vector scoring. Furthermore, the VAE-based speaker codes achieved state-of-the-art performance when scored with PLDA, demonstrating that (1) VAE-based speaker codes are more regularized and suitable for PLDA modeling, and (2) VAE-based regularization and PLDA scoring are complementary.
2 VAE-based speaker regularization
2.1 Revisiting PLDA
The principle of PLDA is to model the marginal distribution of speaker embeddings (i-vectors or x-vectors) by factoring the total variation of the embeddings into between-speaker variation and within-speaker variation. Based on this factorization, the ASV decision can be cast as a hypothesis test [5, 6], formulated by:
R(x_1, x_2) = log p(x_1, x_2 | s_1 = s_2) − log p(x_1) p(x_2)
where R denotes the confidence score, and the equality relation (s_1 = s_2) denotes that the two embeddings are from the same speaker.
According to Eq. (1), PLDA is a linear Gaussian model, in which the prior, the conditional, and the marginal are all Gaussian. If the embeddings do not satisfy this condition, PLDA cannot model them well, leading to inferior performance. This is the case for x-vectors: they are derived from DNNs, and both the speaker prior and the speaker conditionals are unconstrained. In order to deal with the unconstrained distributions of x-vectors, we need a probabilistic model more complex than PLDA.
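The hypothesis-test score has a closed form for a linear Gaussian model: under the same-speaker hypothesis the two embeddings share one speaker code and are jointly Gaussian with correlated blocks. The numpy sketch below illustrates this on toy data; all dimensions and parameters are illustrative assumptions, not values from our systems.

```python
import numpy as np

rng = np.random.default_rng(1)
D, d = 10, 3
mu = np.zeros(D)
V = rng.normal(size=(D, d))
W = 0.1 * np.eye(D)     # within-speaker (residual) covariance
B = V @ V.T             # between-speaker covariance induced by the prior on y

def log_gauss(x, m, C):
    """log N(x; m, C) for a full-covariance Gaussian."""
    diff = x - m
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (len(m) * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(C, diff))

def plda_score(x1, x2):
    """Log-likelihood ratio: same speaker vs. independent speakers."""
    jm = np.concatenate([mu, mu])
    jC = np.block([[B + W, B], [B, B + W]])   # a shared y couples x1 and x2
    return (log_gauss(np.concatenate([x1, x2]), jm, jC)
            - log_gauss(x1, mu, B + W) - log_gauss(x2, mu, B + W))

def utt(y):
    """One utterance embedding of the speaker with code y."""
    return mu + V @ y + rng.multivariate_normal(np.zeros(D), W)

# sanity check: same-speaker pairs should score higher on average
same, diff = [], []
for _ in range(20):
    y1, y2 = rng.normal(size=d), rng.normal(size=d)
    same.append(plda_score(utt(y1), utt(y1)))
    diff.append(plda_score(utt(y1), utt(y2)))
```

The key point is that the score is only correct when the Gaussian assumptions on prior and residual actually hold, which is exactly what unconstrained x-vectors violate.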
2.2 VAE for regularization
VAE is a generative model (like PLDA) that can represent a complex data distribution [17]. The key idea of VAE is to learn a DNN-based mapping function f that maps a simple distribution p(z) to a complex distribution p(x). In other words, it represents complex observations by simply-distributed latent codes via distribution mapping. An illustration of this mapping is shown in Fig. 1. It can be easily shown that the mapped distribution is written as:
p(x) = p(z) |det(∂f⁻¹(x)/∂x)|,  with z = f⁻¹(x),
where f⁻¹ is the inverse function of f.
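As a quick check of this change-of-variables relation, consider a one-dimensional linear map f (a toy assumption for illustration), for which the mapped density is available in closed form:

```python
import numpy as np

# z ~ N(0, 1) is the simple source; f(z) = a*z + b is a toy invertible map.
a, b = 2.0, 1.0
f_inv = lambda x: (x - b) / a                      # inverse of f

def p_z(z):                                        # source density N(0, 1)
    return np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)

def p_x(x):                                        # change-of-variables formula
    return p_z(f_inv(x)) * abs(1.0 / a)            # |d f_inv / dx| = 1/|a|

def ref(x):                                        # closed form: x ~ N(b, a^2)
    return np.exp(-0.5 * ((x - b) / a) ** 2) / (a * np.sqrt(2 * np.pi))

xs = np.linspace(-5.0, 7.0, 50)
# p_x(xs) matches ref(xs) everywhere on the grid
```

With a DNN-based f the inverse and Jacobian are no longer available in closed form, which is why the VAE resorts to variational inference instead.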
Although VAE can be used to represent complex marginals, it does not involve any class structure, and so cannot be used directly in the hypothesis-test scoring framework. Nevertheless, if we can find the posterior p(z|x), the complex x can be mapped to a more constrained z, so the simple cosine distance can be used for verification. Moreover, the regularized code z tends to be easily modeled by PLDA, hence combining the strength of VAE in distribution mapping with the strength of PLDA in distinguishing between- and within-speaker variations. Fortunately, VAE provides a simple way to infer an approximate distribution of p(z|x), denoted by q(z|x). It learns a function g, parameterized by a DNN, to map x to the parameters of q(z|x), which are the mean and covariance if q(z|x) is assumed to be Gaussian. By this setting, the mean vector of q(z|x) can be treated as the VAE-regularized speaker code, and can be used in cosine- or PLDA-based scoring.
Fig. 2 illustrates the VAE framework. In this framework, a decoder f maps z to x, i.e.,
p(x|z) = N(x; f(z), σ²I),
where p(x|z) has been assumed to be Gaussian. Furthermore, an encoder g produces a distribution q(z|x) that approximates the posterior distribution p(z|x) as follows:
q(z|x) = N(z; μ(x), Σ(x)),
where μ(x) and Σ(x) are the mean and covariance produced by the encoder g.
The training objective is the log probability of the training data, log p(x). It is intractable, so a variational lower bound L is optimized instead, which depends on both the encoder q(z|x) and the decoder p(x|z). This is formally written as:
L = −KL(q(z|x) || p(z)) + E_{q(z|x)}[log p(x|z)],
where KL is the KL divergence, and E_{q(z|x)} denotes expectation w.r.t. the distribution q(z|x). As the expectation is intractable, a sampling scheme is often used, as shown in Fig. 2. More details of the training process can be found in [17].
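A single-sample estimate of this bound is easy to write down. The numpy sketch below uses toy linear maps standing in for the encoder and decoder DNNs; the weights and dimensions are illustrative assumptions, not our model's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)                 # one toy observation (a 4-dim "x-vector")

Wg = 0.1 * rng.normal(size=(2, 4))     # toy encoder weights (stand-in for g)
Wf = 0.1 * rng.normal(size=(4, 2))     # toy decoder weights (stand-in for f)

mu, logvar = Wg @ x, np.zeros(2)       # q(z|x) = N(mu, diag(exp(logvar)))
eps = rng.normal(size=2)
z = mu + np.exp(0.5 * logvar) * eps    # reparameterized sample from q(z|x)

# closed-form KL(q(z|x) || N(0, I)) -- the regularization term
kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))

# reconstruction term: log N(x; f(z), I), dropping the additive constant
recon = -0.5 * np.sum((x - Wf @ z) ** 2)

elbo = recon - kl                      # single-sample estimate of the bound
```

The reparameterized sample z = μ + σ·ε is what makes the sampled expectation differentiable with respect to the encoder parameters.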
Note that L involves two components: a regularization term that pushes q(z|x) towards p(z), and a reconstruction term that encourages a good reconstruction of x from z. We are free to tune the relative weights of these two terms in practice, in order to obtain latent codes that are either more regularized or more representative. This freely-modified objective may no longer be a variational lower bound, though non-balanced weights often led to better performance in our experiments.
2.3 Speaker cohesive VAE
The standard VAE only constrains the marginal distribution to be Gaussian, which does not guarantee a Gaussian prior or a Gaussian conditional. This is because the VAE model is purely unsupervised and no speaker information is involved. This lack of speaker information is probably not a big issue for x-vectors, as they are speaker-discriminative already. However, considering speaker information may help VAE to produce better regularization. In particular, if the speaker codes of a particular speaker can be regularized to be Gaussian, scores based on either cosine distance or PLDA will be more comparable across speakers. This can be formulated as an additional term in the VAE objective function, which we call the speaker cohesive loss, denoted by L_c.
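The exact form of L_c is not reproduced here; one plausible instantiation, shown purely as an illustrative assumption, penalizes the scatter of each speaker's codes around that speaker's own centroid, so that same-speaker codes are pulled together:

```python
import numpy as np

def cohesive_loss(codes, speaker_ids):
    """Hypothetical cohesive penalty: mean squared distance of each code
    from the centroid of its own speaker. This is an assumed form for
    illustration, not the loss used in the paper."""
    loss, n = 0.0, 0
    for s in np.unique(speaker_ids):
        c = codes[speaker_ids == s]
        loss += np.sum((c - c.mean(axis=0)) ** 2)
        n += len(c)
    return loss / n

# demo: tightly clustered per-speaker codes incur a lower penalty
rng = np.random.default_rng(0)
ids = np.repeat(np.arange(4), 8)            # 4 toy speakers, 8 codes each
centers = rng.normal(size=(4, 3))
tight = centers[ids] + 0.01 * rng.normal(size=(32, 3))
loose = centers[ids] + 1.00 * rng.normal(size=(32, 3))
```

Any such per-speaker concentration term makes the conditionals tighter and more unimodal, which is the effect the cohesive loss is intended to have.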
3 Experiments
3.1 Data
Three datasets were used in our experiments: VoxCeleb, SITW and CSLTSITW. VoxCeleb was used for model training, while the other two were used for evaluation. More information about these three datasets is presented below.
VoxCeleb: A large-scale free speaker database collected by the University of Oxford, UK [18]. The entire database involves VoxCeleb1 and VoxCeleb2. This dataset, after removing the utterances shared with SITW, was used to train the x-vector model, plus the PLDA and VAE models. Data augmentation was applied, where the MUSAN corpus [19] was used to generate noisy utterances and the room impulse response (RIRS) corpus [20] was used to generate reverberant utterances.
SITW: A standard database used to test ASV performance in real-world conditions [21]. It was collected from open-source media channels, and consists of speech data covering well-known persons. There are two standard datasets for testing: Dev. Core and Eval. Core. We used Dev. Core to select model parameters, and Eval. Core to perform the test in our first experiment. Note that the acoustic condition of SITW is similar to that of the training set VoxCeleb, so this test can be regarded as an in-domain test.
CSLT-SITW: A small dataset collected by CSLT for commercial usage. It consists of a set of speakers, each recording short Chinese command words. The scenarios involve laboratory, corridor, street, restaurant, bus, subway, mall, home, etc. Speakers varied their poses during the recording, and the recording devices were placed both near and far. The acoustic condition of this dataset is quite different from that of the training set VoxCeleb, so it was used for the out-of-domain test.
3.2 Settings
We built several systems to validate the VAE-based regularization, each involving a particular pair of front-end and back-end.
3.2.1 Front-end
x-vector: The baseline x-vector front-end. It was built following the Kaldi SITW recipe [22]. The feature-learning component is a time-delay neural network (TDNN). The statistics pooling layer computes the mean and standard deviation of the frame-level features from a speech segment. The size of the output layer corresponds to the number of speakers in the training set. Once trained, the activations of the penultimate hidden layer are read out as the x-vector.
v-vector: The VAE-regularized speaker code. The VAE model is a feed-forward DNN consisting of several hidden layers with a code layer in between. The x-vectors of all the training utterances were used for VAE training.
c-vector: The VAE-regularized speaker code, with the cohesive loss involved in the VAE training. The model structure is the same as in the v-vector front-end, and the v-vector VAE was used as the initial model for training. The weight of the cohesive loss in the objective function was tuned on the development set.
a-vector: The speaker code regularized by a standard autoencoder (AE). The AE shares a similar structure with the VAE, but its latent codes are not probabilistic, so it is less capable of modeling complex distributions. The AE structure is identical to the VAE model in the v-vector front-end, except that the code layer is deterministic.
3.2.2 Back-end
Cosine: Simple cosine distance.
PCA: PCA-based projection (150-dim) plus cosine distance.
PLDA: PLDA scoring.
LPLDA: LDA-based projection (150-dim) plus PLDA scoring.
PPLDA: PCA-based projection (150-dim) plus PLDA scoring.
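The two simplest back-ends above can be sketched directly in numpy. The helper names below are hypothetical, and the toy data uses a 2-dim projection in place of the 150-dim one used in the real systems:

```python
import numpy as np

def cosine_score(a, b):
    """Cosine distance back-end: score of an enrollment/test pair."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pca_project(X, dim):
    """Fit PCA on training vectors X (rows); return a projection function."""
    m = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - m, full_matrices=False)  # principal axes
    W = Vt[:dim].T                                        # top-`dim` directions
    return lambda v: (v - m) @ W

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 10))      # toy stand-ins for training x-vectors
proj = pca_project(train, dim=2)        # 150-dim in the real systems
v1, v2 = proj(rng.normal(size=10)), proj(rng.normal(size=10))
score = cosine_score(v1, v2)
```

Note that neither back-end uses speaker labels; any discrimination they exhibit must come from the front-end embeddings themselves.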
3.3 In-domain test
The results on the two SITW evaluation sets, Dev. Core and Eval. Core, are reported in Table 1, in terms of equal error rate (EER).
First, focus on the x-vector front-end. It can be found that PLDA scoring outperformed cosine distance. As we argued, this should not be interpreted as reflecting the discriminative nature of PLDA, but rather its regularization capability. This is supported by the observation that the v-vector front-end achieved rather good performance with the cosine back-end (compared with x-vector + PLDA). Since VAE is purely unsupervised, it only contributes regularization. This suggests that PLDA plays a similar role to VAE.
Second, we observe that with PCA or LDA, PLDA performs much better. It is not convincing to assume that LDA and PCA improve the discriminative power of x-vectors (in particular PCA), so the only interpretation is that these two models perform regularization, generating more Gaussian codes that are suitable for PLDA. This regularization is similar to what VAE does, but it seems that VAE did a better job than PCA, and even better than LDA on the larger evaluation set Eval. Core, even without any speaker supervision.
Third, it can be found that c-vectors performed better than v-vectors with cosine scoring, confirming that involving the cohesive loss improves the regularization. When combined with PLDA, however, the advantage of c-vectors diminished. This is expected, as PLDA has already learned the speaker-discriminative knowledge.
Finally, we found that the other unsupervised regularization methods, PCA and AE, cannot obtain reasonable performance with cosine distance, indicating that they cannot achieve good regularization by themselves. This is in contrast to VAE, confirming the importance of probabilistic codes: without this probabilistic nature, it would be impossible to model the complex distribution of x-vectors.
SITW Dev. Core

            Cosine   PCA     PLDA    LPLDA   PPLDA
x-vector    15.67    16.17    9.09    3.12    4.16
a-vector    16.10    16.48   11.21    4.24    5.01
v-vector    10.32     9.94    3.62    3.54    4.31
c-vector     9.05     8.55    3.50    3.31    3.85

SITW Eval. Core

            Cosine   PCA     PLDA    LPLDA   PPLDA
x-vector    16.79    17.22    9.16    3.80    4.84
a-vector    16.05    16.81   12.14    4.27    5.09
v-vector    10.11    10.03    3.64    3.64    4.43
c-vector     9.05     8.83    3.77    3.53    4.10
3.4 Analysis
To better understand the VAE-based regularization, we compute the skewness and kurtosis of the distributions of the different speaker codes. The skewness and (excess) kurtosis are defined as follows:
skew(x) = E[((x − μ)/σ)³]
kurt(x) = E[((x − μ)/σ)⁴] − 3
where μ and σ denote the mean and standard deviation of x, respectively. The more Gaussian a distribution is, the closer these two values are to zero.
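These statistics are straightforward to estimate from samples. The sketch below checks that a Gaussian sample yields values near zero, while a heavier-tailed Laplace sample does not (the two distributions are illustrative stand-ins, not our speaker codes):

```python
import numpy as np

def skewness(x):
    """Sample skewness: third standardized moment."""
    m, s = x.mean(), x.std()
    return float(np.mean(((x - m) / s) ** 3))

def excess_kurtosis(x):
    """Sample excess kurtosis: fourth standardized moment minus 3."""
    m, s = x.mean(), x.std()
    return float(np.mean(((x - m) / s) ** 4) - 3.0)   # zero for a Gaussian

rng = np.random.default_rng(0)
gauss = rng.normal(size=200_000)      # near-zero skewness and excess kurtosis
heavy = rng.laplace(size=200_000)     # heavier tails: excess kurtosis near 3
```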
The utterance-level and speaker-level skewness and kurtosis of the different speaker codes are reported in Table 2. Focusing on the utterance-level results, it can be seen that the skewness and kurtosis values of both the v-vector and c-vector are clearly smaller than those of the x-vector. This means that the v-vector and c-vector are more Gaussian. For the speaker-level results, it can be found that the kurtosis is largely reduced for v-vectors and c-vectors. This indicates that the Gaussian regularization placed by VAE on the marginal has implicitly regularized the prior, which is the major reason that these vectors are more suitable for PLDA. The a-vector, derived from the AE, has smaller skewness but larger kurtosis compared to the x-vector, at both the utterance level and the speaker level, suggesting that the AE did not achieve good regularization.
            Skew(utt)   Kurt(utt)   Skew(spk)   Kurt(spk)
x-vector     0.0423      0.3604      0.0018      0.4499
a-vector     0.0072      0.7740      0.0014      0.9765
v-vector     0.0055      0.1324      0.0042      0.0285
c-vector     0.0043      0.1154      0.0076      0.0298
3.5 Out-of-domain test
In this experiment, we test the performance of the various systems on the CSLT-SITW dataset. Due to the limited data, three-fold cross-validation was used whenever training was required. Three schemes were compared: (1) directly using all the front-end and back-end models trained on VoxCeleb; (2) retraining all the models except the x-vector DNN; (3) the same as the retraining scheme, but with all the PLDA models trained by unsupervised adaptation [23]. The results show that scheme (2) is generally the best, and that PLDA adaptation contributes additional gains in some test settings. For simplicity, only the retraining results under scheme (2) are reported in Table 3. The results exhibit a similar trend as in the SITW test: both the v-vector and c-vector outperform the x-vector, and the c-vector obtained the best performance in nearly all the test settings. Compared to the SITW test, larger performance gains were obtained by the VAE regularization. This might be attributed to the more complex acoustic conditions of CSLT-SITW, though more investigation is required.

            Cosine   PCA     PLDA    LPLDA   PPLDA
x-vector    16.65    16.89   16.91   15.39   13.29
v-vector    13.55    13.71   12.46   12.06   12.02
c-vector    12.98    13.13   12.48   12.01   11.98
4 Conclusions
This paper proposed a VAE-based regularization for deep speaker embedding. With this model, x-vectors that usually exhibit a complex distribution are mapped to latent speaker codes that are simply Gaussian. The model was further enhanced by a speaker cohesive loss, which regularizes the speaker conditionals. Experiments on the SITW dataset and a private commercial dataset demonstrated that the VAE-regularized speaker codes achieve better performance with either cosine distance or PLDA scoring, compared to the x-vector baseline. Future work will investigate a speaker-aware VAE, where speaker codes and utterance codes are hierarchically linked as in PLDA.
References
[1] D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, “Speaker verification using adapted Gaussian mixture models,” Digital Signal Processing, vol. 10, no. 1–3, pp. 19–41, 2000.
[2] P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, “Joint factor analysis versus eigenchannels in speaker recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 4, pp. 1435–1447, 2007.
[3] N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front-end factor analysis for speaker verification,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 788–798, 2011.
[4] P. Kenny, “Joint factor analysis of speaker and session variability: Theory and algorithms,” Tech. Rep., 2005.
[5] S. Ioffe, “Probabilistic linear discriminant analysis,” in European Conference on Computer Vision (ECCV), 2006, pp. 531–542.
[6] S. J. Prince and J. H. Elder, “Probabilistic linear discriminant analysis for inferences about identity,” in 2007 IEEE 11th International Conference on Computer Vision. IEEE, 2007, pp. 1–8.
[7] D. Garcia-Romero and C. Y. Espy-Wilson, “Analysis of i-vector length normalization in speaker recognition systems,” in Twelfth Annual Conference of the International Speech Communication Association, 2011.
[8] E. Variani, X. Lei, E. McDermott, I. L. Moreno, and J. Gonzalez-Dominguez, “Deep neural networks for small footprint text-dependent speaker verification,” in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014, pp. 4052–4056.
[9] G. Heigold, I. Moreno, S. Bengio, and N. Shazeer, “End-to-end text-dependent speaker verification,” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016, pp. 5115–5119.
[10] L. Li, Y. Chen, Y. Shi, Z. Tang, and D. Wang, “Deep speaker feature learning for text-independent speaker verification,” in Interspeech, 2017, pp. 1542–1546.
[11] D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, “X-vectors: Robust DNN embeddings for speaker recognition,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018, pp. 5329–5333.
[12] S.-X. Zhang, Z. Chen, Y. Zhao, J. Li, and Y. Gong, “End-to-end attention based text-dependent speaker verification,” in 2016 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2016, pp. 171–178.
[13] D. Snyder, P. Ghahremani, D. Povey, D. Garcia-Romero, Y. Carmiel, and S. Khudanpur, “Deep neural network-based speaker embeddings for end-to-end speaker verification,” in 2016 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2016, pp. 165–170.
[14] D. Snyder, D. Garcia-Romero, A. McCree, G. Sell, D. Povey, and S. Khudanpur, “Spoken language recognition using x-vectors,” in Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 2018, pp. 105–111. [Online]. Available: http://dx.doi.org/10.21437/Odyssey.2018-15
[15] W. Cai, J. Chen, and M. Li, “Exploring the encoding layer and loss function in end-to-end speaker and language recognition system,” in Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 2018, pp. 74–81. [Online]. Available: http://dx.doi.org/10.21437/Odyssey.2018-11
[16] L. Li, Z. Tang, Y. Shi, and D. Wang, “Gaussian-constrained training for speaker verification,” in 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
[17] D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” arXiv preprint arXiv:1312.6114, 2013.
[18] A. Nagrani, J. S. Chung, and A. Zisserman, “VoxCeleb: a large-scale speaker identification dataset,” arXiv preprint arXiv:1706.08612, 2017.
[19] D. Snyder, G. Chen, and D. Povey, “MUSAN: A Music, Speech, and Noise Corpus,” 2015.
[20] T. Ko, V. Peddinti, D. Povey, M. L. Seltzer, and S. Khudanpur, “A study on data augmentation of reverberant speech for robust speech recognition,” in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2017, pp. 5220–5224.
[21] M. McLaren, L. Ferrer, D. Castan, and A. Lawson, “The speakers in the wild (SITW) speaker recognition database,” in Interspeech, 2016, pp. 818–822.
[22] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz et al., “The Kaldi speech recognition toolkit,” in IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, no. EPFL-CONF-192584. IEEE Signal Processing Society, 2011.
[23] D. Garcia-Romero, X. Zhang, A. McCree, and D. Povey, “Improving speaker recognition performance in the domain adaptation challenge using deep neural networks,” in 2014 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2014, pp. 378–383.