Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification

04/07/2021
by Xugang Lu, et al.

Generative probability models are widely used for speaker verification (SV), but they lack the ability to select discriminative features. Viewed as a hypothesis test, SV is a binary classification task, which can be implemented as a Siamese neural network (SiamNN) with discriminative training. However, most discriminative training schemes for SiamNNs consider only the distribution of pairwise sample distances and ignore the additional discriminative information in the joint distribution of samples. In this paper, we propose a novel SiamNN that takes the joint distribution of samples into account. The joint distribution is first formulated with a joint Bayesian (JB) generative model; a SiamNN is then designed with dense layers that approximate the factorized affine transforms used in the JB model. After initializing the SiamNN with the learned parameters of the JB model, we further train the model parameters on pairwise samples as a binary discrimination task for SV. We carried out SV experiments on the Speakers in the Wild (SITW) and VoxCeleb corpora. Experimental results showed that the proposed model improved performance by a large margin compared with state-of-the-art models for SV.
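The abstract does not spell out the equations, so the following is only a minimal sketch under stated assumptions. It assumes the standard JB formulation, in which a speaker feature is x = mu + eps with mu ~ N(0, S_mu) (speaker identity) and eps ~ N(0, S_eps) (within-speaker variation), and the verification score is, up to a constant and a factor of 1/2, the log-likelihood ratio r(x1, x2) = x1'A x1 + x2'A x2 - 2 x1'G x2. The function names (jb_matrices, factorize, siamese_score) are hypothetical, and the eigendecomposition used here to obtain shared linear maps is an illustrative stand-in for the factorized affine transforms, not necessarily the factorization the authors use.

```python
import numpy as np

def jb_matrices(S_mu, S_eps):
    """Return (A, G) such that r = x1'A x1 + x2'A x2 - 2 x1'G x2.

    Assumed model: same-speaker pairs have joint covariance
    [[S_mu+S_eps, S_mu], [S_mu, S_mu+S_eps]]; different-speaker pairs
    have the same diagonal blocks and zero off-diagonal blocks.
    """
    total_inv = np.linalg.inv(S_mu + S_eps)
    plus_inv = np.linalg.inv(2.0 * S_mu + S_eps)
    eps_inv = np.linalg.inv(S_eps)
    # Blocks of the inverse same-speaker covariance [[F+G, G], [G, F+G]].
    G = 0.5 * (plus_inv - eps_inv)
    F_plus_G = 0.5 * (plus_inv + eps_inv)
    A = total_inv - F_plus_G
    return A, G

def factorize(M):
    """Factor a symmetric M as W.T @ diag(w) @ W (illustrative choice)."""
    w, U = np.linalg.eigh(0.5 * (M + M.T))  # symmetrize against round-off
    return U.T, w

def siamese_score(x1, x2, WA, wA, WG, wG):
    """Score a pair by sending both inputs through the same linear maps."""
    a1, a2 = WA @ x1, WA @ x2   # shared "dense layer" for the A term
    g1, g2 = WG @ x1, WG @ x2   # shared "dense layer" for the G term
    return wA @ (a1 ** 2) + wA @ (a2 ** 2) - 2.0 * (wG @ (g1 * g2))
```

In this sketch, WA, wA, WG and wG play the role of the JB-derived initial weights; in a discriminative setup they would then be treated as trainable parameters and updated on labeled same-speaker and different-speaker pairs.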


