A Speaker Verification Backend with Robust Performance across Conditions

02/02/2021
by   Luciana Ferrer, et al.

In this paper, we address the problem of speaker verification in conditions unseen or unknown during development. A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network and processing them through a backend composed of probabilistic linear discriminant analysis (PLDA) and global logistic regression score calibration. This method is known to result in systems that work poorly on conditions different from those used to train the calibration model. We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs. The backend is trained discriminatively to optimize binary cross-entropy. When trained on a number of diverse datasets that are labeled only with respect to speaker, the proposed backend consistently and, in some cases, dramatically improves calibration, compared to the standard PLDA approach, on a number of held-out datasets, some of which are markedly different from the training data. Discrimination performance is also consistently improved. We show that joint training of the PLDA and the adaptive calibrator is essential – the same benefits cannot be achieved when freezing PLDA and fine-tuning the calibrator. To our knowledge, the results in this paper are the first evidence in the literature that it is possible to develop a speaker verification system with robust out-of-the-box performance on a large variety of conditions.
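The contrast between global and adaptive calibration described above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the linear dependence of the scale and offset on the side information, and the use of log-duration as the only side-information feature are all assumptions made for illustration. The sketch shows a global affine calibrator, a hypothetical adaptive variant whose parameters vary per trial, and the binary cross-entropy objective that either could be trained to minimize.

```python
import numpy as np


def global_calibration(scores, alpha, beta):
    """Global affine calibration: one (alpha, beta) pair shared by every trial."""
    return alpha * scores + beta


def adaptive_calibration(scores, side_info, w_alpha, w_beta):
    """Hypothetical adaptive calibrator (illustrative, not the paper's exact form).

    side_info has shape (n_trials, n_features), e.g. log-durations.
    A constant column of ones is appended, so the last entries of
    w_alpha and w_beta play the role of the global scale and offset.
    """
    feats = np.hstack([side_info, np.ones((len(scores), 1))])
    alpha = feats @ w_alpha  # per-trial scale
    beta = feats @ w_beta    # per-trial offset
    return alpha * scores + beta


def binary_cross_entropy(llrs, labels, prior=0.5):
    """Mean binary cross-entropy of the posteriors implied by the LLRs.

    labels are 1 for same-speaker trials and 0 for different-speaker trials.
    """
    logit_post = llrs + np.log(prior / (1.0 - prior))
    # -log(sigmoid(x)) for targets, -log(1 - sigmoid(x)) for non-targets,
    # both computed stably with logaddexp.
    losses = (labels * np.logaddexp(0.0, -logit_post)
              + (1 - labels) * np.logaddexp(0.0, logit_post))
    return float(np.mean(losses))


# Toy usage: three trials with raw PLDA-like scores and one side-information
# feature per trial (log of a made-up duration in seconds).
scores = np.array([2.0, -1.0, 0.5])
side_info = np.log(np.array([[10.0], [30.0], [5.0]]))
labels = np.array([1, 0, 1])

llr_global = global_calibration(scores, alpha=1.5, beta=-0.2)
llr_adaptive = adaptive_calibration(
    scores, side_info,
    w_alpha=np.array([0.1, 1.5]),   # scale grows with log-duration (assumed)
    w_beta=np.array([0.0, -0.2]),
)
loss_global = binary_cross_entropy(llr_global, labels)
loss_adaptive = binary_cross_entropy(llr_adaptive, labels)
```

When the side-information weights are set to zero, the adaptive calibrator reduces to the global one, which makes the global model a special case of the adaptive form in this sketch.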


