Why Did the x-Vector System Miss a Target Speaker? Impact of Acoustic Mismatch Upon Target Score on VoxCeleb Data

Modern automatic speaker verification (ASV) relies heavily on machine learning implemented through deep neural networks. It can be difficult to interpret the output of these black boxes. In line with interpretative machine learning, we model the dependency of the ASV detection score upon acoustic mismatch between the enrollment and test utterances. We aim to identify mismatch factors that explain target speaker misses (false rejections). We use distances in the first- and second-order statistics of selected acoustic features as the predictors in a linear mixed effects model, while a standard Kaldi x-vector system serves as our ASV black box. Our results on the VoxCeleb data reveal the most prominent mismatch factor to be in F0 mean, followed by mismatches associated with formant frequencies. Our findings indicate that x-vector systems lack robustness to intra-speaker variations.
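
As a rough illustration of the kind of mismatch predictor described above, the following sketch computes the distance between the first- and second-order statistics of one acoustic feature (F0) for an enrollment/test pair. It is not the authors' implementation: the function name `mismatch_predictors`, the use of the absolute difference as the distance, the use of the standard deviation as the second-order statistic, and the toy F0 values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def mismatch_predictors(enroll_f0, test_f0):
    """Return absolute distances in F0 mean and F0 standard deviation.

    Both arguments are per-frame F0 values in Hz for voiced frames.
    The choice of absolute difference as the distance is an assumption.
    """
    return {
        # First-order statistic: distance between the F0 means.
        "f0_mean_dist": abs(mean(enroll_f0) - mean(test_f0)),
        # Second-order statistic: distance between the F0 standard deviations.
        "f0_std_dist": abs(pstdev(enroll_f0) - pstdev(test_f0)),
    }


# Toy, made-up F0 tracks (Hz) for one enrollment/test trial.
enroll = [100.0, 110.0, 120.0]
test = [130.0, 140.0, 150.0]

predictors = mismatch_predictors(enroll, test)
print(predictors)
```

In a study of this kind, one such row of predictors per target trial would then be regressed against the ASV detection score in a linear mixed effects model; the model fitting itself is not shown here.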

