Neural i-vectors

04/03/2020
by Ville Vestman, et al.

Deep speaker embeddings have been demonstrated to outperform their generative counterparts, i-vectors, in recent speaker verification evaluations. To combine the benefits of high performance and generative interpretation, we investigate the use of a deep embedding extractor and an i-vector extractor in succession. To bundle the deep embedding extractor with an i-vector extractor, we add aggregation layers inspired by the Gaussian mixture model (GMM) to the embedding extractor networks. The inclusion of a GMM-like layer allows the discriminatively trained network to serve as a provider of sufficient statistics for the i-vector extractor, which then extracts what we call neural i-vectors. We compare the deep embeddings to the proposed neural i-vectors on the Speakers in the Wild (SITW) and the Speaker Recognition Evaluation (SRE) 2018 and 2019 datasets. On the core-core condition of SITW, our deep embeddings obtain performance comparable to the state-of-the-art. The neural i-vectors obtain about 50% worse performance than the deep embeddings, but on the other hand outperform the previous i-vector approaches reported in the literature by a clear margin.
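
To make the idea of a network acting as a "provider of sufficient statistics" more concrete, the following is a minimal NumPy sketch of the generic pipeline, not the authors' architecture. It assumes the GMM-like layer yields per-frame soft assignments to C components (here obtained from hypothetical `logits`), accumulates zeroth- and first-order statistics from them, and then computes the textbook i-vector posterior mean with diagonal covariances. All function names, shapes, and the random inputs are illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sufficient_statistics(features, logits):
    """Accumulate zeroth- and first-order statistics over frames.

    features: (T, D) frame-level features (assumed to come from the network).
    logits:   (T, C) unnormalized component scores (assumed output of a
              GMM-like aggregation layer).
    Returns n with shape (C,) and f with shape (C, D).
    """
    posteriors = softmax(logits, axis=1)  # (T, C) soft assignments per frame
    n = posteriors.sum(axis=0)            # zeroth-order statistics
    f = posteriors.T @ features           # first-order statistics
    return n, f


def ivector_posterior_mean(n, f, means, t_matrix, precisions):
    """Standard i-vector point estimate with diagonal covariances.

    means:      (C, D) component means.
    t_matrix:   (C, D, R) total variability matrix, one block per component.
    precisions: (C, D) inverse diagonal covariances.
    Computes w = (I + sum_c n_c T_c' P_c T_c)^{-1} sum_c T_c' P_c (f_c - n_c m_c).
    """
    rank = t_matrix.shape[2]
    f_centered = f - n[:, None] * means
    pt = t_matrix * precisions[:, :, None]  # P_c T_c for every component
    lhs = np.eye(rank) + np.einsum("c,cdr,cds->rs", n, pt, t_matrix)
    rhs = np.einsum("cdr,cd->r", pt, f_centered)
    return np.linalg.solve(lhs, rhs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_frames, dim, num_comp, rank = 200, 8, 4, 5  # illustrative sizes
    features = rng.normal(size=(num_frames, dim))
    logits = rng.normal(size=(num_frames, num_comp))
    n, f = sufficient_statistics(features, logits)

    means = rng.normal(size=(num_comp, dim))
    t_matrix = 0.1 * rng.normal(size=(num_comp, dim, rank))
    precisions = np.ones((num_comp, dim))
    w = ivector_posterior_mean(n, f, means, t_matrix, precisions)
    print(w.shape)
```

In this sketch the only role of the discriminatively trained network is to supply `features` and `logits`; everything after `sufficient_statistics` is the usual generative i-vector machinery, which is what allows the two stages to be used in succession.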

research
10/31/2018

Discriminatively Re-trained i-vector Extractor for Speaker Recognition

In this work we revisit discriminative training of the i-vector extracto...
research
06/20/2019

Unleashing the Unused Potential of I-Vectors Enabled by GPU Acceleration

Speaker embeddings are continuous-value vector representations that allo...
research
11/05/2018

How to Improve Your Speaker Embeddings Extractor in Generic Toolkits

Recently, speaker embeddings extracted with deep neural networks became ...
research
04/05/2019

Factorization of Discriminatively Trained i-vector Extractor for Speaker Recognition

In this work, we continue in our research on i-vector extractor for spea...
research
04/26/2018

On deep speaker embeddings for text-independent speaker recognition

We investigate deep neural network performance in the text-independent sp...
research
11/08/2022

High-resolution embedding extractor for speaker diarisation

Speaker embedding extractors significantly influence the performance of ...
research
10/12/2015

VB calibration to improve the interface between phone recognizer and i-vector extractor

The EM training algorithm of the classical i-vector extractor is often i...
