Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization

12/08/2021
by Mufan Sang, et al.

Training speaker-discriminative and robust speaker verification systems without speaker labels remains challenging and worth exploring. In this study, we propose an effective self-supervised learning framework and a novel regularization strategy to facilitate self-supervised speaker representation learning. Unlike contrastive self-supervised learning methods, the proposed self-supervised regularization (SSReg) focuses exclusively on the similarity between the latent representations of positive data pairs. We also explore the effectiveness of alternative online data augmentation strategies in both the time domain and the frequency domain. With our strong online data augmentation strategy, the proposed SSReg shows the potential of self-supervised learning without negative pairs, and it significantly improves the performance of self-supervised speaker representation learning with a simple Siamese network architecture. Comprehensive experiments on the VoxCeleb datasets demonstrate that our proposed self-supervised approach obtains a 23.4% relative improvement by adding the self-supervised regularization and outperforms previous works.
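
The abstract does not give the exact form of the objective, so the following is only a minimal sketch of the general idea of learning from positive pairs alone. It assumes a SimSiam-style symmetric negative cosine similarity between two augmented views of the same utterance. The names `p1`, `p2`, `z1`, `z2` and `positive_pair_loss` are hypothetical and are not taken from the paper, and the SSReg term itself is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def positive_pair_loss(p1, z2, p2, z1):
    """Symmetric negative cosine similarity for one positive pair.

    Assumption (SimSiam-style, not confirmed by the abstract):
      p1, p2 are predictor outputs for augmented views 1 and 2;
      z1, z2 are encoder outputs for the same views, which a training
      framework would treat as constants (stop-gradient).
    No negative pairs appear anywhere in this loss.
    """
    return -0.5 * (cosine_similarity(p1, z2) + cosine_similarity(p2, z1))


if __name__ == "__main__":
    # Two views whose representations already agree reach the minimum.
    aligned = positive_pair_loss([3.0, 4.0], [3.0, 4.0], [3.0, 4.0], [3.0, 4.0])
    print(aligned)
```

Under these assumptions the loss is bounded between -1 (representations of the two views point in the same direction) and 1 (opposite directions), so minimizing it pulls the two views of each utterance together without comparing against other utterances.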


