Unified Hypersphere Embedding for Speaker Recognition

07/22/2018
by Mahdi Hajibabaei, et al.

Incremental improvements in the accuracy of Convolutional Neural Networks are usually achieved through the use of deeper and more complex models trained on larger datasets. However, enlarging datasets and models increases computation and storage costs and cannot be done indefinitely. In this work, we seek to improve the identification and verification accuracy of a text-independent speaker recognition system without extra data or deeper and more complex models, by augmenting the training and testing data, finding the optimal dimensionality of the embedding space, and using more discriminative loss functions. Results of experiments on the VoxCeleb dataset suggest that: (i) Simple repetition and random time-reversion of utterances can reduce prediction errors by up to 18%. (ii) Lower dimensional embeddings are more suitable for verification. (iii) Use of the proposed logistic margin loss function leads to unified embeddings with state-of-the-art identification and competitive verification accuracies.
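
The repetition and random time-reversion augmentation named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `repeat_to_length` and `augment`, the fixed `target_len`, the default `reverse_prob` of 0.5, and the choice to reverse after repeating are all assumptions made for illustration, and the utterance is treated as a plain list of samples.

```python
import random
from typing import List, Optional


def repeat_to_length(samples: List[float], target_len: int) -> List[float]:
    """Tile an utterance end to end until it covers target_len samples, then trim.

    Hypothetical helper: the abstract only says utterances are repeated.
    """
    if not samples:
        raise ValueError("utterance must contain at least one sample")
    repeats = -(-target_len // len(samples))  # ceiling division
    return (samples * repeats)[:target_len]


def augment(
    samples: List[float],
    target_len: int,
    reverse_prob: float = 0.5,  # assumed value, not taken from the paper
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Repeat the utterance to a fixed length, then reverse it in time at random."""
    rng = rng or random.Random()
    out = repeat_to_length(samples, target_len)
    # rng.random() lies in [0.0, 1.0), so reverse_prob=1.0 always reverses
    # and reverse_prob=0.0 never does.
    if rng.random() < reverse_prob:
        out = out[::-1]
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible, which can help when comparing runs.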
