Exploring Binary Classification Loss For Speaker Verification

07/17/2023
by Bing Han, et al.

The mismatch between closed-set training and open-set testing usually leads to significant performance degradation in speaker verification. Among existing loss functions, metric learning-based objectives depend strongly on finding effective pairs, which may hinder further improvement, while popular multi-class classification methods are often observed to degrade when evaluated on unseen speakers. In this work, we introduce the SphereFace2 framework, which trains the speaker model in a pair-wise manner using several binary classifiers instead of performing multi-class classification. Benefiting from this learning paradigm, it can effectively narrow the gap between training and evaluation. Experiments conducted on VoxCeleb show that SphereFace2 outperforms other existing loss functions, especially on hard trials. In addition, the large-margin fine-tuning strategy proves compatible with it and yields further improvements. Finally, SphereFace2 also shows strong robustness to class-wise noisy labels, which suggests it could be applied to semi-supervised training with inaccurately estimated pseudo labels. Code is available at https://github.com/Hunterhuan/sphereface2_speaker_verification
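To make the idea of training with several binary classifiers concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one weight vector per training speaker, cosine similarity as the score, an additive margin, and a weighted softplus loss per binary classifier. The function name `binary_classification_loss` and the hyperparameter values (`scale`, `margin`, `bias`, `lam`) are illustrative assumptions; the actual loss in the linked repository may differ, for example in how the similarity is adjusted before the margin is applied.

```python
import numpy as np

def binary_classification_loss(embeddings, weights, labels,
                               scale=32.0, margin=0.2, bias=0.0, lam=0.7):
    """Sketch of a binary-classifier loss over speaker classes.

    embeddings: (batch, dim) speaker embeddings
    weights:    (num_speakers, dim) one weight vector per training speaker
    labels:     (batch,) integer speaker indices
    All hyperparameter defaults are assumed values for illustration only.
    """
    # Work on the unit hypersphere: normalize embeddings and class weights.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = emb @ w.T  # (batch, num_speakers) cosine similarities

    # Mark the target speaker of each sample.
    one_hot = np.zeros_like(cos)
    one_hot[np.arange(len(labels)), labels] = 1.0

    # Each speaker has its own binary classifier ("is this speaker i or not?").
    # The margin makes both decisions harder: the target must clear
    # cos - margin, the non-targets must stay below cos + margin.
    pos_logit = scale * (cos - margin) + bias
    neg_logit = scale * (cos + margin) + bias

    # log(1 + exp(x)) computed stably via logaddexp(0, x).
    pos_loss = np.logaddexp(0.0, -pos_logit)
    neg_loss = np.logaddexp(0.0, neg_logit)

    # lam balances the single positive term against the many negative terms.
    per_sample = (lam * one_hot * pos_loss
                  + (1.0 - lam) * (1.0 - one_hot) * neg_loss).sum(axis=1) / scale
    return per_sample.mean()

# Toy usage: three speakers with orthogonal class weights.
toy_weights = np.eye(3)
toy_embeddings = np.eye(3)
loss_correct = binary_classification_loss(toy_embeddings, toy_weights, np.array([0, 1, 2]))
loss_wrong = binary_classification_loss(toy_embeddings, toy_weights, np.array([1, 2, 0]))
```

In this sketch no softmax couples the classes, so each speaker's classifier is trained independently of the others. In the toy usage, `loss_correct` is smaller than `loss_wrong` because the embeddings align with their labeled speakers only in the first case.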


