MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages

11/30/2022
by Yue Li, et al.

This report describes the NPU-HC speaker verification system submitted to the O-COCOSDA Multi-lingual Speaker Verification (MSV) Challenge 2022, which focuses on developing speaker verification systems for low-resource Asian languages. We participate in the I-MSV track, which targets speaker verification for various Indian languages. We first explore different neural network frameworks for low-resource speaker verification. We then use vanilla fine-tuning and weight transfer fine-tuning to adapt models pre-trained on out-domain data to the in-domain Indian dataset. Specifically, weight transfer fine-tuning constrains the distance between the weights of the pre-trained model and those of the fine-tuned model. This retains the discriminative ability acquired from the large-scale out-domain datasets while avoiding both catastrophic forgetting and overfitting. Finally, score fusion is adopted to further improve performance. With these techniques combined, we obtain 0.223 EER on the public evaluation set, ranking 2nd on the leaderboard. On the private evaluation set, the EER of our submitted system is 2.123 for the constrained and unconstrained sub-tasks of the I-MSV track, leading to 1st and 3rd place in the ranking, respectively.
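
The two ideas named above, a weight-distance constraint during fine-tuning and score fusion, can be illustrated with a minimal sketch. The abstract does not state which distance measure or fusion rule the system uses, so the code assumes a squared L2 distance added to the task loss and a weighted average of per-system scores. All function names, the `strength` parameter, and the toy values are hypothetical and are not taken from the report.

```python
from typing import Dict, List, Optional, Sequence


def weight_distance_penalty(
    finetuned: Dict[str, List[float]],
    pretrained: Dict[str, List[float]],
) -> float:
    """Squared L2 distance between fine-tuned and pre-trained weights.

    Assumption: the report only says the distance is constrained;
    squared L2 is one common choice, used here for illustration.
    """
    total = 0.0
    for name, pre_values in pretrained.items():
        for w, w0 in zip(finetuned[name], pre_values):
            total += (w - w0) ** 2
    return total


def regularized_loss(
    task_loss: float,
    finetuned: Dict[str, List[float]],
    pretrained: Dict[str, List[float]],
    strength: float = 0.01,  # hypothetical trade-off coefficient
) -> float:
    """Task loss plus a penalty that keeps weights near the pre-trained ones."""
    return task_loss + strength * weight_distance_penalty(finetuned, pretrained)


def fuse_scores(
    system_scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of trial scores from several systems.

    Assumption: a weighted average is used; the report's actual
    fusion rule is not given in the abstract.
    """
    if not system_scores:
        raise ValueError("need at least one system")
    n_trials = len(system_scores[0])
    if any(len(scores) != n_trials for scores in system_scores):
        raise ValueError("all systems must score the same trials")
    if weights is None:
        weights = [1.0] * len(system_scores)
    if len(weights) != len(system_scores):
        raise ValueError("one weight per system is required")
    total = sum(weights)
    return [
        sum(w * scores[i] for w, scores in zip(weights, system_scores)) / total
        for i in range(n_trials)
    ]


# Toy usage with made-up numbers.
pre = {"layer": [1.0, 2.0]}
fine = {"layer": [1.5, 2.0]}
loss = regularized_loss(1.0, fine, pre, strength=2.0)
fused = fuse_scores([[0.2, 0.8], [0.4, 0.6]])
```

A larger `strength` pulls the fine-tuned weights more strongly toward the pre-trained ones, trading in-domain fit for retention of what was learned on the out-domain data.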


