VAE-based regularization for deep speaker embedding

04/07/2019
by   Yang Zhang, et al.

Deep speaker embedding has achieved state-of-the-art performance in speaker recognition. A potential problem is that these embedded vectors (called `x-vectors') are not Gaussian, which degrades performance with the widely used PLDA back-end scoring. In this paper, we propose a regularization approach based on the Variational Auto-Encoder (VAE). This model transforms x-vectors into a latent space where the mapped latent codes are more Gaussian and hence more suitable for PLDA scoring.
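The abstract does not spell out the training objective, so the following is only a minimal sketch of the generic VAE regularizer that pulls latent codes toward a Gaussian. It assumes a diagonal-Gaussian posterior and a standard normal prior; the function names and the encoder outputs `mu` and `log_var` are hypothetical and are not taken from the paper.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a single latent code."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


# Made-up encoder outputs for one x-vector (assumption, for illustration only).
mu = [0.3, -1.2, 0.0]
log_var = [0.0, -0.5, 0.2]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
z = sample_latent(mu, log_var, rng)      # latent code that would be passed on for scoring
kl = kl_to_standard_normal(mu, log_var)  # penalty that grows as the posterior departs from N(0, I)
```

In a full VAE this KL term would be added to a reconstruction loss on the x-vector; the term is zero only when the posterior already matches the standard normal prior.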
