Channel adversarial training for speaker verification and diarization

10/25/2019
by Chau Luu, et al.

Previous work has encouraged domain invariance in deep speaker embeddings by adversarially classifying the dataset or labelled environment to which the generated features belong. We propose a training strategy that aims to produce features invariant at the granularity of the recording or channel, a finer-grained objective than dataset or environment invariance. An adversary is trained, in a Siamese fashion, to predict whether pairs of same-speaker embeddings belong to the same recording; this discourages the learned features from using channel information that may be speaker-discriminative during training. In experiments on verification with VoxCeleb, and on diarization and verification with CALLHOME, the approach shows promising improvements over a strong baseline and outperforms a dataset-adversarial model. The VoxCeleb model performs particularly well, achieving a 4% relative improvement in EER over a Kaldi baseline while using a similar architecture and less training data.
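To make the adversary's task concrete, the following is a minimal sketch of how same-speaker pairs might be labelled for a same-recording classifier. The abstract does not describe how pairs are sampled, so the exhaustive pairing, the `(utt_id, speaker_id, recording_id)` tuple layout, and the function name `same_speaker_pairs` are all assumptions made for illustration, not the authors' procedure.

```python
import itertools
from collections import defaultdict


def same_speaker_pairs(utterances):
    """Build adversary training pairs from utterance metadata.

    `utterances` is an iterable of (utt_id, speaker_id, recording_id)
    tuples; this layout is hypothetical and chosen for illustration.

    Returns a list of (utt_a, utt_b, same_recording) tuples, where both
    utterances share a speaker and same_recording is 1 if they also
    come from the same recording, else 0.
    """
    # Group utterances by speaker so that every pair is same-speaker.
    by_speaker = defaultdict(list)
    for utt_id, speaker_id, recording_id in utterances:
        by_speaker[speaker_id].append((utt_id, recording_id))

    pairs = []
    for items in by_speaker.values():
        # Enumerate all unordered pairs within one speaker (an assumption;
        # a real pipeline would more likely sample pairs per mini-batch).
        for (utt_a, rec_a), (utt_b, rec_b) in itertools.combinations(items, 2):
            pairs.append((utt_a, utt_b, int(rec_a == rec_b)))
    return pairs


# Toy metadata: speaker A appears in two recordings, speaker B in one.
utterances = [
    ("u1", "spkA", "rec1"),
    ("u2", "spkA", "rec1"),
    ("u3", "spkA", "rec2"),
    ("u4", "spkB", "rec3"),
    ("u5", "spkB", "rec3"),
]
pairs = same_speaker_pairs(utterances)
```

Because every pair shares a speaker, the binary label can only be predicted from recording or channel cues in the embeddings. An adversary that succeeds at this task therefore indicates channel information that the embedding extractor is then trained to suppress.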
