Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning

12/13/2020
by   Wei Xia, et al.

In this study, we investigate self-supervised representation learning for speaker verification (SV). First, we compare a simple contrastive learning approach (SimCLR) with a momentum contrastive (MoCo) learning framework, where the MoCo speaker embedding system uses a queue to maintain a large set of negative examples. We show that momentum contrastive learning yields better speaker embeddings. Next, alternative augmentation strategies are explored to normalize extrinsic speaker variabilities across two random segments drawn from the same speech utterance. In particular, augmentation applied directly to the waveform substantially improves the speaker representations for SV tasks. The proposed MoCo speaker embedding is further improved by introducing a prototypical memory bank, which, through an intermediate clustering step, encourages the speaker embeddings to move closer to their assigned prototypes. In addition, we generalize the self-supervised framework to a semi-supervised scenario in which only a small portion of the data is labeled. Comprehensive experiments on the VoxCeleb dataset demonstrate that our proposed self-supervised approach achieves competitive performance compared with existing techniques, and can approach fully supervised results with partially labeled data.
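To make the MoCo-style training objective concrete, the sketch below shows a minimal NumPy version of the InfoNCE loss with queue negatives and the momentum (EMA) encoder update that the abstract refers to. This is an illustrative sketch under stated assumptions, not the authors' implementation: function names, the temperature value, and the momentum coefficient are hypothetical defaults, and all embeddings are assumed L2-normalized.

```python
import numpy as np

def info_nce_loss(q, k_pos, queue, tau=0.07):
    """InfoNCE loss for one query embedding (illustrative sketch).

    q     : (d,)   query-encoder embedding of one augmented segment
    k_pos : (d,)   momentum-encoder embedding of the other segment (positive)
    queue : (K, d) momentum-encoder embeddings from past batches (negatives)
    tau   : temperature (0.07 is a common choice, not taken from the paper)
    All embeddings are assumed L2-normalized.
    """
    # Similarity logits: positive pair first, then the K queued negatives.
    logits = np.concatenate(([q @ k_pos], queue @ q)) / tau
    logits -= logits.max()  # numerical stability before the softmax
    # Cross-entropy with the positive at index 0.
    return -(logits[0] - np.log(np.exp(logits).sum()))

def momentum_update(theta_k, theta_q, m=0.999):
    """EMA update of momentum-encoder weights from query-encoder weights."""
    return m * theta_k + (1.0 - m) * theta_q
```

Two random segments of the same utterance form the positive pair, while the queue supplies negatives from other utterances; after each step the queue is refreshed with the latest momentum-encoder outputs, which is what lets MoCo keep a large, consistent negative set without a huge batch.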


