Robust Speaker Recognition with Transformers Using wav2vec 2.0

03/28/2022
by   Sergey Novoselov, et al.

Recent advances in unsupervised speech representation learning have opened up new approaches and set new state-of-the-art results for diverse speech processing tasks. This paper investigates the use of wav2vec 2.0 deep speech representations for the speaker recognition task. The proposed procedure fine-tunes wav2vec 2.0 with a simple TDNN and statistics pooling back-end using additive angular margin loss, and yields a deep speaker embedding extractor that generalizes well across different domains. We conclude that the Contrastive Predictive Coding pretraining scheme efficiently utilizes the power of unlabeled data, and thus opens the door to powerful transformer-based speaker recognition systems. The experimental results obtained in this study demonstrate that fine-tuning can be done on relatively small sets and on a clean version of the data. Using data augmentation during fine-tuning provides additional performance gains in speaker verification. In this study, speaker recognition systems were analyzed on a wide range of well-known verification protocols: the VoxCeleb1 cleaned test set, the NIST SRE 18 development set, the NIST SRE 2016 and NIST SRE 2019 evaluation sets, the VOiCES evaluation set, and the NIST SRE 2021 and CTS challenge sets.
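The two back-end components named in the abstract, statistics pooling and additive angular margin loss, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the toy frame vectors standing in for wav2vec 2.0 outputs, and the margin and scale values are all assumptions chosen for illustration, and the TDNN layers are omitted entirely.

```python
import math

def stats_pooling(frames):
    """Collapse T frame vectors of size D into one 2*D vector:
    per-dimension means followed by per-dimension standard deviations."""
    t, d = len(frames), len(frames[0])
    means = [sum(f[i] for f in frames) / t for i in range(d)]
    stds = [math.sqrt(sum((f[i] - means[i]) ** 2 for f in frames) / t)
            for i in range(d)]
    return means + stds

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def aam_softmax_loss(embedding, class_weights, target, margin=0.2, scale=30.0):
    """Cross-entropy over scaled cosine logits, with an angular margin added
    to the target-class angle. margin and scale are illustrative values only.
    Simplification: no special handling when theta + margin exceeds pi."""
    e = l2_normalize(embedding)
    logits = []
    for k, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding error
        if k == target:
            cos = math.cos(math.acos(cos) + margin)  # penalize the true class
        logits.append(scale * cos)
    # Numerically stable log-sum-exp for the softmax denominator.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Toy usage: three 2-dimensional "frames" pooled into a 4-dimensional vector,
# then scored against two hypothetical speaker classes.
pooled = stats_pooling([[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]])
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
loss = aam_softmax_loss(pooled, weights, target=0)
```

The margin makes the target class harder to satisfy during training, which pushes embeddings of the same speaker closer together in angle. In a real system the pooled vector would pass through further layers before the loss, and the whole stack would be trained with an automatic differentiation framework.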


research
11/03/2021

STC speaker recognition systems for the NIST SRE 2021

This paper presents a description of STC Ltd. systems submitted to the N...
research
06/20/2020

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

We show for the first time that learning powerful representations from s...
research
09/30/2021

Fine-tuning wav2vec2 for speaker recognition

This paper explores applying the wav2vec2 framework to speaker recogniti...
research
12/11/2020

Exploring wav2vec 2.0 on speaker verification and language identification

Wav2vec 2.0 is a recently proposed self-supervised framework for speech ...
research
04/21/2022

The NIST CTS Speaker Recognition Challenge

The US National Institute of Standards and Technology (NIST) has been co...
research
10/24/2016

UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation

This document briefly describes the systems submitted by the Center for ...
research
04/21/2022

The 2021 NIST Speaker Recognition Evaluation

The 2021 Speaker Recognition Evaluation (SRE21) was the latest cycle of ...
