Raw waveform speaker verification for supervised and self-supervised learning

03/16/2022
by   Jee-weon Jung, et al.

Speaker verification models that operate directly on raw waveforms are receiving growing attention. However, their performance is less competitive than that of the state-of-the-art handcrafted feature-based counterparts, which demonstrate equal error rates under 1%. In addition, they have not yet been explored with self-supervised learning frameworks. This paper proposes a new raw waveform speaker verification model that incorporates techniques proven effective for speaker verification, including the Res2Net backbone module and an aggregation method that considers both context and channels. Under the best-performing configuration, the model shows an equal error rate of 0.89%. We also explore the proposed model with a self-supervised learning framework and show state-of-the-art performance in this line of research. Finally, we show that the model trained with self-supervision successfully serves as a pre-trained model in the semi-supervised scenario, where only a limited amount of data is assumed to have ground-truth labels and a larger amount of data is unlabeled.
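The abstract reports results as equal error rates (EER), so a small illustration of the metric may help. The sketch below uses the generic textbook definition (the operating point where the false acceptance rate and the false rejection rate coincide) and is not the authors' evaluation code. The function name `equal_error_rate` and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction in [0, 1]) from trial scores.

    A trial is accepted when its score is >= the threshold. Every observed
    score is tried as a threshold, and the one where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest is used.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = None
    eer = None
    for t in thresholds:
        # FAR: share of non-target (impostor) trials wrongly accepted.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # FRR: share of target (same-speaker) trials wrongly rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # With finitely many trials FAR and FRR may not meet exactly,
            # so their midpoint at the closest threshold is reported.
            eer = (far + frr) / 2
    return eer


# Hypothetical similarity scores, not taken from the paper.
toy_targets = [0.9, 0.8, 0.7, 0.4]
toy_nontargets = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

Under this definition, an EER of 0.89% would correspond to a returned value of about 0.0089. Evaluation toolkits typically interpolate between thresholds rather than taking the closest observed one, so values computed this way can differ slightly on small trial lists.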


