Xi-Vector Embedding for Speaker Recognition

08/12/2021
by Kong Aik Lee, et al.

We present a Bayesian formulation for deep speaker embedding, wherein the xi-vector is the Bayesian counterpart of the x-vector, taking into account the uncertainty estimate. On the technology front, we offer a simple and straightforward extension to the now widely used x-vector. It consists of an auxiliary neural net that predicts the frame-wise uncertainty of the input sequence. We show that the proposed extension leads to substantial improvement across all operating points, with a significant reduction in error rates and detection cost. On the theoretical front, our proposal integrates the Bayesian formulation of the linear Gaussian model into speaker-embedding neural networks via the pooling layer. In one sense, our proposal integrates the Bayesian formulation of the i-vector into that of the x-vector. Hence, we refer to the embedding as the xi-vector, which is pronounced as /zai/ vector. Experimental results on the SITW evaluation set show a consistent improvement of over 17.5% in equal-error-rate and 10.9%
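The pooling idea in the abstract can be illustrated with a small sketch. It assumes the standard posterior of a linear Gaussian model with diagonal precisions: each frame contributes a feature vector and a per-dimension precision (the inverse of its predicted uncertainty), and these are combined with a Gaussian prior into a precision-weighted mean. The function name `posterior_pool`, its arguments, and the toy numbers are hypothetical and chosen for illustration only; they are not taken from the authors' implementation, where the frame features and precisions would be produced by the encoder and the auxiliary network.

```python
from typing import List, Sequence, Tuple


def posterior_pool(
    frames: Sequence[Sequence[float]],
    precisions: Sequence[Sequence[float]],
    prior_mean: Sequence[float],
    prior_precision: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Precision-weighted pooling over frames (diagonal Gaussian assumption).

    Returns the posterior mean and posterior precision per dimension.
    """
    dim = len(prior_mean)
    # Start from the prior: its precision and its precision-weighted mean.
    post_prec = list(prior_precision)
    weighted = [prior_precision[d] * prior_mean[d] for d in range(dim)]
    # Each frame adds its precision and its precision-weighted feature.
    for z, lam in zip(frames, precisions):
        for d in range(dim):
            post_prec[d] += lam[d]
            weighted[d] += lam[d] * z[d]
    post_mean = [weighted[d] / post_prec[d] for d in range(dim)]
    return post_mean, post_prec


# Toy input: two frames, two dimensions (illustrative values only).
frames = [[1.0, 0.0], [3.0, 4.0]]
precisions = [[1.0, 1.0], [3.0, 1.0]]
mean, prec = posterior_pool(frames, precisions, [0.0, 0.0], [1.0, 2.0])
print(mean, prec)
```

Under these assumptions, frames with higher precision (lower uncertainty) pull the pooled mean toward themselves, whereas plain average pooling would weight every frame equally.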


