Contrastive-mixup learning for improved speaker verification

02/22/2022
by Xin Zhang, et al.

This paper proposes a novel formulation of prototypical loss with mixup for speaker verification. Mixup is a simple yet efficient data augmentation technique that constructs weighted combinations of random pairs of data points and their labels for deep neural network training. Mixup has attracted increasing attention due to its ability to improve the robustness and generalization of deep neural networks. Although mixup has shown success in diverse domains, most applications have centered on closed-set classification tasks. In this work, we propose contrastive-mixup, a novel augmentation strategy that learns distinguishing representations based on a distance metric. During training, mixup operations generate convex interpolations of both inputs and virtual labels. Moreover, we reformulate the prototypical loss function so that mixup can be applied to metric learning objectives. To demonstrate its generalization given limited training data, we conduct experiments by varying the number of available utterances from each speaker in the VoxCeleb database. Experimental results show that applying contrastive-mixup outperforms the existing baseline, reducing the error rate by a relative 16%, especially when the number of training utterances per speaker is limited.
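The two ingredients named in the abstract, convex interpolation of inputs and virtual labels and a distance-based prototypical objective, can be illustrated with a minimal sketch. This is not the paper's exact formulation: the function names, the use of squared Euclidean distance, the Beta(0.5, 0.5) mixing distribution, and the toy 2-D "embeddings" (used directly, with no encoder network) are all assumptions made for illustration.

```python
import numpy as np


def mixup(x_a, x_b, y_a, y_b, lam):
    """Convex interpolation of two inputs and their (one-hot) labels."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix


def soft_prototypical_loss(query, prototypes, soft_label):
    """Cross-entropy between a soft label and a softmax over negative
    squared Euclidean distances from the query to each prototype.

    Assumption: this is a generic soft-label variant of prototypical
    loss, not necessarily the reformulation used in the paper.
    """
    d2 = np.sum((prototypes - query) ** 2, axis=1)
    logits = -d2
    logits = logits - logits.max()  # stabilise the log-softmax
    log_prob = logits - np.log(np.sum(np.exp(logits)))
    return float(-np.sum(soft_label * log_prob))


# Toy episode: two speakers, two support embeddings each (hypothetical data).
support = np.array([
    [[0.0, 0.0], [0.2, -0.2]],   # speaker 0
    [[3.0, 3.0], [2.8, 3.2]],    # speaker 1
])
prototypes = support.mean(axis=1)  # one prototype (centroid) per speaker

# One query per speaker, with one-hot virtual labels within the episode.
q_a, y_a = np.array([0.1, 0.1]), np.array([1.0, 0.0])
q_b, y_b = np.array([2.9, 3.1]), np.array([0.0, 1.0])

rng = np.random.default_rng(0)
lam = rng.beta(0.5, 0.5)  # mixing coefficient; Beta parameters are assumed

q_mix, y_mix = mixup(q_a, q_b, y_a, y_b, lam)
loss = soft_prototypical_loss(q_mix, prototypes, y_mix)
print(f"lambda={lam:.3f}, mixed-query loss={loss:.4f}")
```

Because the loss is linear in the soft label, the loss on a mixed query equals the `lam`-weighted sum of the losses computed against each of the two original labels, which is the usual way mixup is attached to a cross-entropy-style objective.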
