Unleashing the Unused Potential of I-Vectors Enabled by GPU Acceleration

06/20/2019
by Ville Vestman, et al.

Speaker embeddings are continuous-valued vector representations that allow speakers' voices to be compared with simple geometric operations. Among others, i-vectors and x-vectors have emerged as the mainstream speaker embedding methods. In this paper, we illustrate the use of a modern computation platform to harness the benefit of GPU acceleration for i-vector extraction. In particular, we achieve a 3000-fold acceleration over real time in frame posterior computation and a 25-fold speed-up in training the i-vector extractor compared to the CPU baseline from the Kaldi toolkit. This significant speed-up allows the exploration of ideas that were previously infeasible. In particular, we show that it is beneficial to update the universal background model (UBM) and re-compute frame alignments while training the i-vector extractor. Additionally, we are able to study different variations of i-vector extractors more rigorously than before. In this process, we reveal some undocumented details of Kaldi's i-vector extractor and show that it outperforms the standard formulation by a margin of 1 to 2% on a speaker verification protocol. All of our findings are confirmed by ensemble averaging the results from multiple runs with random initialization.
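
For readers unfamiliar with the term, the frame posterior computation mentioned above can be illustrated with a minimal sketch. It is not the paper's GPU implementation and not Kaldi's code. It assumes a diagonal-covariance Gaussian mixture as the UBM, and the function name `frame_posteriors` and the toy parameters are hypothetical, chosen only for illustration. For one feature frame, it computes the posterior probability (the alignment) of each mixture component.

```python
import math


def frame_posteriors(frame, weights, means, variances):
    """Posterior of each diagonal-covariance Gaussian given one feature frame.

    Illustrative only: a real system would process many frames and
    components at once in batched matrix operations.
    """
    log_joint = []
    for w, mu, var in zip(weights, means, variances):
        # log of (component weight * Gaussian density), summed over dimensions
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_joint.append(ll)
    # Normalize in the log domain (log-sum-exp) for numerical stability
    peak = max(log_joint)
    log_norm = peak + math.log(sum(math.exp(l - peak) for l in log_joint))
    return [math.exp(l - log_norm) for l in log_joint]


# Toy UBM (assumed values): two components, two feature dimensions
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

post = frame_posteriors([0.1, -0.2], weights, means, variances)
```

The posteriors sum to one, and a frame lying close to the first component's mean is assigned mostly to that component. Repeating this for every frame of every utterance is the step that benefits from batching on a GPU.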


research
04/03/2020

Neural i-vectors

Deep speaker embeddings have been demonstrated to outperform their gener...
research
10/31/2018

Discriminatively Re-trained i-vector Extractor for Speaker Recognition

In this work we revisit discriminative training of the i-vector extracto...
research
04/05/2019

Factorization of Discriminatively Trained i-vector Extractor for Speaker Recognition

In this work, we continue in our research on i-vector extractor for spea...
research
11/08/2022

High-resolution embedding extractor for speaker diarisation

Speaker embedding extractors significantly influence the performance of ...
research
12/29/2020

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks

The recently proposed VBx diarization method uses a Bayesian hidden Mark...
research
03/30/2022

Multi-target Filter and Detector for Unknown-number Speaker Diarization

A strong representation of a target speaker can aid in extracting import...
research
11/20/2015

Variational Bayes Factor Analysis for i-Vector Extraction

In this document we are going to derive the equations needed to implemen...
