Tackling the Score Shift in Cross-Lingual Speaker Verification by Exploiting Language Information

10/18/2021
by Jenthe Thienpondt, et al.

This paper presents a post-challenge performance analysis of cross-lingual speaker verification for the IDLab submission to the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC-21). We show that current speaker embedding extractors consistently underestimate speaker similarity in within-speaker cross-lingual trials. Consequently, the typical training and scoring protocols do not put enough emphasis on the compensation of intra-speaker language variability. We propose two techniques to increase cross-lingual speaker verification robustness. First, we enhance our previously proposed Large-Margin Fine-Tuning (LM-FT) training stage with a mini-batch sampling strategy that increases the number of intra-speaker cross-lingual samples within each mini-batch. Second, we incorporate language information in the logistic regression calibration stage by integrating quality metrics based on soft and hard decisions of a VoxLingua107 language identification model. The proposed techniques result in an 11.7% relative improvement over the baseline model on the VoxSRC-21 test set and contributed to our third-place finish in the corresponding challenge.
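To make the calibration idea concrete, the following is a minimal sketch of a logistic regression calibrator that takes a language-based quality metric alongside the raw verification score. The abstract does not spell out the exact metrics, so the two helper functions (`same_language_hard`, `same_language_soft`), the plain gradient-descent fit, and the four toy trials are all assumptions made for illustration and are not the authors' implementation or data.

```python
import math

def same_language_hard(post_a, post_b):
    """Hypothetical hard metric: 1.0 if both utterances get the same top language."""
    top_a = max(range(len(post_a)), key=post_a.__getitem__)
    top_b = max(range(len(post_b)), key=post_b.__getitem__)
    return 1.0 if top_a == top_b else 0.0

def same_language_soft(post_a, post_b):
    """Hypothetical soft metric: probability that both utterances share a
    language, treating the two language posteriors as independent."""
    return sum(pa * pb for pa, pb in zip(post_a, post_b))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def calibrated_score(w, x):
    """Linear calibration: weights for each feature, bias stored last in w."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def fit_calibration(features, labels, lr=1.0, epochs=3000):
    """Fit logistic regression with batch gradient descent (assumed solver)."""
    dim, n = len(features[0]), len(features)
    w = [0.0] * (dim + 1)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(features, labels):
            err = sigmoid(calibrated_score(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

# Toy trials (made-up numbers): each feature vector is
# [raw score, hard same-language flag]; label 1 = same speaker.
# The cross-lingual target trial scores below the same-language
# non-target trial, mimicking the score shift described above.
features = [[0.70, 1.0], [0.45, 0.0], [0.50, 1.0], [0.20, 0.0]]
labels = [1, 1, 0, 0]
w = fit_calibration(features, labels)

cross_lingual_target = calibrated_score(w, [0.45, 0.0])
same_language_nontarget = calibrated_score(w, [0.50, 1.0])
print(w, cross_lingual_target, same_language_nontarget)
```

In this toy setup a single threshold on the raw score cannot separate the two middle trials, whereas the calibrator can learn a language-dependent offset that restores the expected ordering. A soft metric such as `same_language_soft` could be appended as an extra feature column in the same way.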


