Fine-tuning wav2vec2 for speaker recognition

09/30/2021
by Nik Vaessen, et al.

This paper explores applying the wav2vec2 framework to speaker recognition instead of speech recognition. We study the effectiveness of the pre-trained weights on the speaker recognition task, and how to pool the wav2vec2 output sequence into a fixed-length speaker embedding. To adapt the framework to speaker recognition, we propose a single-utterance classification variant with cross-entropy (CE) or additive angular margin (AAM) softmax loss, and an utterance-pair classification variant with binary cross-entropy (BCE) loss. Our best performing variant, w2v2-aam, achieves a 1.88% EER on the extended VoxCeleb1 test set, compared to 1.69% EER with an ECAPA-TDNN baseline. Code is available at https://github.com/nikvaessen/w2v2-speaker.
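
As a rough illustration of the single-utterance variant described above, the following sketch pools a sequence of frame-level vectors into one fixed-length embedding and scores it with an AAM softmax loss. It is not taken from the w2v2-speaker repository: mean pooling is only one assumed pooling choice, the function names (mean_pool, l2_normalize, aam_softmax_loss) are hypothetical, and the margin and scale values are illustrative defaults rather than the paper's settings. Plain Python lists stand in for the wav2vec2 output tensor.

```python
import math


def mean_pool(frames):
    """Average a T x D sequence of frame vectors into one D-dim embedding."""
    num_frames = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def aam_softmax_loss(embedding, class_weights, target, margin=0.2, scale=30.0):
    """AAM softmax loss for a single embedding (simplified sketch).

    margin and scale are assumed, illustrative values. With margin=0 this
    reduces to cross-entropy over scaled cosine similarities. The case
    theta + margin > pi is not specially handled here.
    """
    emb = l2_normalize(embedding)
    logits = []
    for idx, weight in enumerate(class_weights):
        # Cosine similarity between the embedding and this speaker's weight.
        cos = sum(e * w for e, w in zip(emb, l2_normalize(weight)))
        if idx == target:
            # Add the angular margin to the target speaker's angle only.
            theta = math.acos(max(-1.0, min(1.0, cos)))
            cos = math.cos(theta + margin)
        logits.append(scale * cos)
    # Numerically stable log-sum-exp, then negative log-likelihood of target.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[target]


# Toy input: 3 frames of dimension 2, and 2 hypothetical speaker classes.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class_weights = [[1.0, 1.0], [1.0, -1.0]]

pooled = mean_pool(frames)
loss_with_margin = aam_softmax_loss(pooled, class_weights, target=1)
loss_without_margin = aam_softmax_loss(pooled, class_weights, target=1, margin=0.0)
print(pooled, loss_with_margin, loss_without_margin)
```

The margin makes the target speaker's logit smaller for the same embedding, so the loss with a margin is larger than the plain cosine softmax loss; training then has to pull embeddings closer to their speaker's weight vector to compensate.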

02/26/2019

Utterance-level Aggregation For Speaker Recognition In The Wild

The objective of this paper is speaker recognition "in the wild"-where u...
02/03/2020

Within-sample variability-invariant loss for robust speaker recognition under noisy environments

Despite the significant improvements in speaker recognition enabled by d...
11/02/2021

PatchGame: Learning to Signal Mid-level Patches in Referential Games

We study a referential game (a type of signaling game) where two agents ...
10/26/2019

Sum-Product Networks for Robust Automatic Speaker Recognition

The performance of a speaker recognition system degrades considerably in...
05/25/2021

Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework

The performance of speaker recognition system is highly dependent on the...
11/25/2020

SAR-Net: A End-to-End Deep Speech Accent Recognition Network

This paper proposes a end-to-end deep network to recognize kinds of acce...
06/05/2018

LSTM Benchmarks for Deep Learning Frameworks

This study provides benchmarks for different implementations of LSTM uni...

Code Repositories

w2v2-speaker

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053

