Comparison of Multiple Features and Modeling Methods for Text-dependent Speaker Verification

07/14/2017
by Yi Liu, et al.

Text-dependent speaker verification is becoming popular in the speaker recognition community. However, the conventional i-vector framework, which has been successful for speaker identification and similar tasks, performs relatively poorly on this task. Researchers have proposed several new methods to improve performance, but it remains unclear which model is the best choice, especially when the pass-phrases are prompted during enrollment and test. In this paper, we introduce four modeling methods and compare their performance on the recently published RedDots dataset. To explore the influence of different frame alignments, both the Viterbi and forward-backward algorithms are used in the HMM-based models. Several bottleneck features are also investigated. Our experiments show that, by explicitly modeling the lexical content, HMM-based modeling achieves good results in the fixed-phrase condition. In the prompted-phrase condition, GMM-HMM and i-vector/HMM are not as successful. In both conditions, the forward-backward algorithm brings more benefits to the i-vector/HMM system. We also find that even though bottleneck features perform well for text-independent speaker verification, they do not outperform MFCCs on the most challenging Imposter-Correct trials on RedDots.
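
The two frame alignments mentioned in the abstract differ in how they assign frames to HMM states. Viterbi gives each frame to exactly one state (a hard alignment), while forward-backward gives each frame a posterior weight over all states (a soft alignment). The following is a minimal sketch of that difference on a toy two-state left-to-right HMM. The function names `viterbi` and `forward_backward`, the transition matrix, and the per-frame emission likelihoods are all invented for illustration and are not taken from the paper or from RedDots.

```python
import math


def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(init, trans, emis):
    """Hard alignment: the single most likely state sequence."""
    T, S = len(emis), len(init)
    delta = [[float("-inf")] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    for s in range(S):
        delta[0][s] = safe_log(init[s]) + safe_log(emis[0][s])
    for t in range(1, T):
        for s in range(S):
            # Best predecessor state for state s at frame t.
            best = max(range(S),
                       key=lambda r: delta[t - 1][r] + safe_log(trans[r][s]))
            delta[t][s] = (delta[t - 1][best] + safe_log(trans[best][s])
                           + safe_log(emis[t][s]))
            back[t][s] = best
    # Trace back from the best final state.
    path = [max(range(S), key=lambda s: delta[T - 1][s])]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def forward_backward(init, trans, emis):
    """Soft alignment: gamma[t][s] = P(state s at frame t | all frames)."""
    T, S = len(emis), len(init)
    alpha = [[0.0] * S for _ in range(T)]
    scale = [0.0] * T
    # Forward pass, normalised per frame to avoid underflow.
    for t in range(T):
        for s in range(S):
            prior = init[s] if t == 0 else sum(
                alpha[t - 1][r] * trans[r][s] for r in range(S))
            alpha[t][s] = prior * emis[t][s]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    # Backward pass, reusing the forward scale factors.
    beta = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for s in range(S):
            beta[t][s] = sum(trans[s][r] * emis[t + 1][r] * beta[t + 1][r]
                             for r in range(S)) / scale[t + 1]
    return [[alpha[t][s] * beta[t][s] for s in range(S)] for t in range(T)]


# Toy model (assumed values): start in state 0, never move backwards.
init = [1.0, 0.0]
trans = [[0.6, 0.4],
         [0.0, 1.0]]
# emis[t][s]: likelihood of frame t under state s (made-up numbers).
emis = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]]

path = viterbi(init, trans, emis)
gamma = forward_backward(init, trans, emis)

for t, (state, post) in enumerate(zip(path, gamma)):
    print(t, state, [round(p, 3) for p in post])
```

In a statistics-based system, the hard path would contribute each frame to a single state's sufficient statistics, whereas the posteriors in `gamma` would split each frame across states in proportion to their weights. On ambiguous frames the two alignments can disagree, which is one plausible reason the choice of algorithm affects downstream performance.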

research
09/28/2018

Spoken Pass-Phrase Verification in the i-vector Space

The task of spoken pass-phrase verification is to decide whether a test ...
research
05/24/2015

Deep Speaker Vectors for Semi Text-independent Speaker Verification

Recent research shows that deep neural networks (DNNs) can be used to ex...
research
11/19/2016

Incorporating Pass-Phrase Dependent Background Models for Text-Dependent Speaker Verification

In this paper, we propose pass-phrase dependent background models (PBMs)...
research
10/22/2020

Robust Text-Dependent Speaker Verification via Character-Level Information Preservation for the SdSV Challenge 2020

This paper describes our submission to Task 1 of the Short-duration Spea...
research
07/01/2017

Modeling and Analyzing the Vocal Tract under Normal and Stressful Talking Conditions

In this research, we model and analyze the vocal tract under normal and ...
research
05/11/2019

Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification

There are a number of studies about extraction of bottleneck (BN) featur...
research
12/22/2018

Differentiable Supervector Extraction for Encoding Speaker and Phrase Information in Text Dependent Speaker Verification

In this paper, we propose a new differentiable neural network alignment ...