A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification

03/31/2020
by Juan M. Coria, et al.

Despite the growing popularity of metric learning approaches, very little work has attempted a fair comparison of these techniques for speaker verification. We try to fill this gap by systematically comparing several metric learning loss functions on the VoxCeleb dataset. The first family of loss functions is derived from the cross-entropy loss (usually used for supervised classification) and includes the congenerous cosine loss, the additive angular margin loss, and the center loss. The second family focuses on the similarity between training samples and includes the contrastive loss and the triplet loss. We show that the additive angular margin loss outperforms all other loss functions in the study while learning more robust representations. The network used in this paper combines SincNet trainable features with the x-vector architecture. When trained with the additive angular margin loss, it brings us a step closer to a truly end-to-end speaker verification system while remaining competitive with the x-vector baseline. In the spirit of reproducible research, we also release open-source Python code for reproducing our results and share pretrained PyTorch models on torch.hub that can be used either directly or after fine-tuning.
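To make the two families of loss functions more concrete, here is a minimal pure-Python sketch of one representative from each: the additive angular margin loss (cross-entropy family) and the triplet loss (similarity family). It is an illustration under stated assumptions, not the code released with the paper: the function names, the `margin` and `scale` defaults, and the use of cosine distance in the triplet loss are all choices made for this example, and the sketch skips the special handling that practical implementations apply when the angle plus the margin exceeds π.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def aam_softmax_loss(embedding, class_weights, target, margin=0.2, scale=30.0):
    """Additive angular margin loss for a single sample.

    class_weights holds one weight vector per training speaker.
    margin and scale are illustrative defaults (assumptions).
    """
    logits = []
    for k, weight in enumerate(class_weights):
        # Clamp to guard acos against floating-point drift outside [-1, 1].
        cos_theta = max(-1.0, min(1.0, cosine(embedding, weight)))
        if k == target:
            # Penalize the target class by adding the margin to its angle.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Cross-entropy via a numerically stable log-sum-exp.
    top = max(logits)
    log_z = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_z - logits[target]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss using cosine distance (1 - cosine similarity)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    # Zero once the negative is farther than the positive by at least margin.
    return max(0.0, d_pos - d_neg + margin)
```

The first function needs one weight vector per training speaker, which ties it to a closed set of classes during training. The second only compares embeddings with each other, which is what distinguishes the similarity-based family described above.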


