Supervector Compression Strategies to Speed up I-Vector System Development

05/03/2018
by Ville Vestman, et al.

The front-end factor analysis (FEFA), an extension of probabilistic principal component analysis (PPCA) tailored to be used with Gaussian mixture models (GMMs), is currently the prevalent approach to extract compact utterance-level features (i-vectors) for automatic speaker verification (ASV) systems. Little research has been conducted comparing FEFA to the conventional PPCA applied to maximum a posteriori (MAP) adapted GMM supervectors. We study several alternative methods to compress MAP-adapted GMM supervectors, including PPCA, factor analysis (FA), and two supervised approaches: supervised PPCA (SPPCA) and the recently proposed probabilistic partial least squares (PPLS). The resulting i-vectors are used in ASV tasks with a probabilistic linear discriminant analysis (PLDA) back-end. We experiment on two datasets: the telephone condition of NIST SRE 2010 and the recent VoxCeleb corpus, collected from YouTube videos containing celebrity interviews recorded in various acoustical and technical conditions. The results suggest that, in terms of ASV accuracy, the supervector compression approaches are on a par with FEFA. The supervised approaches did not result in improved performance. In comparison to FEFA, we obtained speedups of more than a hundred-fold (100x) in total variability model (TVM) training using the PPCA and FA supervector compression approaches.
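As a rough illustration of what compressing supervectors with PPCA can look like, the sketch below fits the standard closed-form maximum-likelihood PPCA solution and uses the posterior mean of the latent variable as a low-dimensional vector. It is not the authors' implementation: the function names `fit_ppca` and `extract`, the dimensions, and the synthetic data standing in for MAP-adapted GMM supervectors are all assumptions made for this example.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA fit (hypothetical helper).

    X: (n, d) matrix with one supervector per row; q: latent dimension.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    # Thin SVD avoids forming the d x d covariance matrix explicitly.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / n  # nonzero eigenvalues of the sample covariance
    # Noise variance: average of the d - q discarded eigenvalues
    # (eigenvalues beyond min(n, d) are zero and add nothing to the sum).
    sigma2 = eigvals[q:].sum() / (d - q)
    # Loading matrix: top-q eigenvectors scaled by sqrt(eigenvalue - sigma2).
    W = Vt[:q].T * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def extract(X, mu, W, sigma2):
    """Posterior mean of the latent variable for each row of X."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)
    return np.linalg.solve(M, W.T @ (X - mu).T).T

# Synthetic stand-in for MAP-adapted supervectors (assumed sizes).
rng = np.random.default_rng(0)
n_utts, sv_dim, ivec_dim = 200, 512, 20
latent = rng.standard_normal((n_utts, ivec_dim))
loading = rng.standard_normal((ivec_dim, sv_dim))
supervectors = latent @ loading + 0.1 * rng.standard_normal((n_utts, sv_dim))

mu, W, sigma2 = fit_ppca(supervectors, ivec_dim)
ivectors = extract(supervectors, mu, W, sigma2)
print(ivectors.shape)
```

In a real system the rows of `supervectors` would come from MAP adaptation of a universal background model, and the extracted vectors would be passed to a back-end such as PLDA. The closed-form fit needs only one SVD of the centered data, which gives some intuition for why such compression can train much faster than an iterative FEFA procedure.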


