End-to-End Residual CNN with L-GM Loss Speaker Verification System

05/02/2018
by Xuan Shi, et al.

We propose an end-to-end speaker verification system based on a neural network and trained with a loss function of lower computational complexity. The system uses a ResNet architecture to extract features from an utterance, applies mean pooling to produce utterance-level speaker embeddings, and is trained with the large-margin Gaussian Mixture (L-GM) loss function. Through its large-margin and likelihood regularization terms, the L-GM loss improves speaker verification performance. Experimental results demonstrate that the Residual CNN with L-GM loss outperforms a DNN-based i-vector baseline by nearly 10
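The pooling and loss steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simplified L-GM loss with identity covariance and equal class priors, so the loss reduces to a margin-scaled softmax over half squared distances to class means plus a likelihood term. The function names `mean_pool` and `lgm_loss`, and the default values of `alpha` (margin) and `lam` (regularization weight), are hypothetical choices for illustration.

```python
import math


def mean_pool(frames):
    """Average frame-level feature vectors into one utterance-level embedding."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / n for i in range(dim)]


def lgm_loss(embedding, label, means, alpha=0.01, lam=0.1):
    """Simplified L-GM loss for one embedding.

    Assumptions (not from the source): identity covariance, equal priors.
    alpha and lam are illustrative defaults, not values from the paper.
    """
    # Half squared Euclidean distance from the embedding to each class mean.
    dists = [
        0.5 * sum((x - m) ** 2 for x, m in zip(embedding, mu))
        for mu in means
    ]
    # Large margin: enlarge the true-class distance by a factor (1 + alpha).
    logits = [
        -(d * (1.0 + alpha) if k == label else d)
        for k, d in enumerate(dists)
    ]
    # Numerically stable log-sum-exp for the softmax cross-entropy term.
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(v - top) for v in logits))
    classification = log_norm - logits[label]
    # Likelihood regularization: pulls the embedding toward its class mean.
    likelihood = dists[label]
    return classification + lam * likelihood


# Two frames of 2-dimensional features from one utterance (toy values).
frames = [[1.0, 0.0], [3.0, 2.0]]
embedding = mean_pool(frames)
# Two speaker classes with toy mean vectors.
means = [[2.0, 1.0], [0.0, 0.0]]
loss_correct = lgm_loss(embedding, 0, means)
loss_wrong = lgm_loss(embedding, 1, means)
```

In this toy setup the embedding coincides with the mean of class 0, so the loss for label 0 is lower than for label 1. In a real system the frame features would come from the ResNet and the class means would be learned jointly with the network.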
