Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space

08/21/2021
by   Labib Chowdhury, et al.

Deep learning models have become an increasingly preferred option for biometric recognition systems, such as speaker recognition. SincNet, a deep neural network architecture, gained popularity in speaker recognition tasks due to its parameterized sinc functions, which allow it to work directly on the raw speech signal. The original SincNet architecture uses the softmax loss, which may not be the most suitable choice for recognition-based tasks: it neither imposes inter-class margins nor differentiates between easy and hard training samples. Curriculum learning approaches, particularly those leveraging angular margin-based losses, have proven very successful in other biometric applications such as face recognition. The advantage of such curriculum learning-based techniques is that they impose inter-class margins while also taking into account easy and hard samples. In this paper, we propose Curricular SincNet (CL-SincNet), an improved SincNet model in which a curricular loss function is used to train the SincNet architecture. The proposed model is evaluated on multiple datasets using intra-dataset and inter-dataset evaluation protocols. In both settings, the model performs competitively with previously published work. In the case of inter-dataset testing, it achieves the best overall results, with a 4% reduction in error rate compared to SincNet and other published work.
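
The abstract does not spell out the curricular loss, so the following is only a minimal sketch of how such a loss can combine an angular margin with extra emphasis on hard samples. It assumes the CurricularFace-style formulation (listed among the related papers below) rather than the exact CL-SincNet implementation. The function names, the margin `m`, the scale `s`, and the `momentum` value are illustrative assumptions, and the handling of angles where θ + m exceeds π is omitted for brevity.

```python
import math

def _unit(v):
    """L2-normalize a vector so that dot products become cosines."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def curricular_logits(embeddings, class_weights, labels, t,
                      m=0.5, s=64.0, momentum=0.01):
    """Return (scaled logits, updated t) for one mini-batch.

    Assumed CurricularFace-style rule, not taken from the abstract:
    - target class: cos(theta_y + m), an additive angular margin
    - easy negative (cos_j <= cos(theta_y + m)): left unchanged
    - hard negative (cos_j >  cos(theta_y + m)): cos_j * (t + cos_j)
    """
    xs = [_unit(x) for x in embeddings]
    ws = [_unit(w) for w in class_weights]  # one weight vector per speaker
    cos = [[max(-1.0, min(1.0, sum(a * b for a, b in zip(x, w))))
            for w in ws] for x in xs]
    target_cos = [row[y] for row, y in zip(cos, labels)]
    # t tracks the running mean of target cosines, so hard negatives are
    # weighted more strongly as training progresses (the "curriculum").
    t = momentum * sum(target_cos) / len(target_cos) + (1.0 - momentum) * t
    logits = []
    for row, y, cy in zip(cos, labels, target_cos):
        cy_m = math.cos(math.acos(cy) + m)  # margin-penalized target cosine
        out = []
        for j, c in enumerate(row):
            if j == y:
                out.append(s * cy_m)
            elif c > cy_m:                  # hard negative: modulate by t
                out.append(s * c * (t + c))
            else:                           # easy negative: plain cosine
                out.append(s * c)
        logits.append(out)
    return logits, t

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, using a stable log-sum-exp."""
    total = 0.0
    for row, y in zip(logits, labels):
        mx = max(row)
        log_z = mx + math.log(sum(math.exp(v - mx) for v in row))
        total += log_z - row[y]
    return total / len(labels)
```

In a training loop under these assumptions, `embeddings` would be the utterance-level outputs of the SincNet backbone, and the returned `t` would be passed back in on the next mini-batch. With `m = 0` and the hard-negative branch disabled, the computation reduces to a softmax loss on scaled cosine logits.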

research
09/12/2021

A Decidability-Based Loss Function

Nowadays, deep learning is the standard approach for a wide range of pro...
research
01/28/2019

Additive Margin SincNet for Speaker Recognition

Speaker Recognition is a challenging task with essential applications su...
research
04/01/2020

CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition

As an emerging topic in face recognition, designing margin-based loss fu...
research
08/12/2019

A Study on Angular Based Embedding Learning for Text-independent Speaker Verification

Learning a good speaker embedding is important for many automatic speake...
research
05/24/2022

SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition

Deep face recognition has achieved great success due to large-scale trai...
research
08/04/2020

Intra-class variation reduction of speaker representation in disentanglement framework

In this paper, we propose an effective training strategy to extract rob...
research
11/12/2020

DSAM: A Distance Shrinking with Angular Marginalizing Loss for High Performance Vehicle Re-identificatio

Vehicle Re-identification (ReID) is an important yet challenging problem...
