NPU Speaker Verification System for INTERSPEECH 2020 Far-Field Speaker Verification Challenge

by Li Zhang, et al.

This paper describes the NPU system submitted to the INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC). We focus on far-field text-dependent speaker verification (SV) from a single microphone array (task 1) and from multiple microphone arrays (task 3). The major challenges in these scenarios are short utterances and the cross-channel and distance mismatch between enrollment and test. Believing that a better speaker embedding can alleviate the effects of short utterances, we introduce a new speaker embedding architecture, ResNet-BAM, which integrates a bottleneck attention module with ResNet as a simple and efficient way to further improve the representation power of ResNet. This contribution brings up to 1% EER reduction. We further address the mismatch problem in three directions. First, domain adversarial training, which aims to learn domain-invariant features, can yield a 0.8% EER reduction. Second, front-end signal processing, including WPE and beamforming, makes no obvious contribution on its own, but together with data selection and domain adversarial training it can contribute a further 0.5% EER reduction. Finally, data augmentation, which works with a specifically designed data selection strategy, can lead to a 2% EER reduction. With these contributions combined, in the interim challenge results our single submission system (without multi-system fusion) achieves first and second place on task 1 and task 3, respectively.
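
The abstract names domain adversarial training as a way to learn domain-invariant features but does not say how it is implemented. One common realisation is a gradient reversal layer placed between the embedding network and a domain classifier; the sketch below illustrates only that idea and is an assumption, not the authors' method. The class `GradientReversal`, the helper `domain_grad_wrt_features`, the weight `lam`, and all numeric values are hypothetical and chosen for illustration.

```python
import math


class GradientReversal:
    """Identity in the forward pass; scales the gradient by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical trade-off weight for the adversarial term

    def forward(self, features):
        # The embedding passes through unchanged on its way to the domain classifier.
        return list(features)

    def backward(self, grad_output):
        # The sign flip pushes the embedding network to *increase* the domain loss,
        # which encourages features the domain classifier cannot separate.
        return [-self.lam * g for g in grad_output]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_grad_wrt_features(features, weights, bias, domain_label):
    """Gradient of a binary cross-entropy domain loss w.r.t. the input features,
    for a logistic-regression domain classifier (an illustrative stand-in)."""
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    p = sigmoid(logit)
    # dL/dlogit = p - y, and dlogit/df_i = w_i
    return [(p - domain_label) * w for w in weights]


# Toy example: a 3-dimensional "speaker embedding" and a fixed domain classifier.
grl = GradientReversal(lam=0.5)
embedding = [0.2, -1.0, 0.7]
features = grl.forward(embedding)
grad = domain_grad_wrt_features(
    features, weights=[1.0, -2.0, 0.5], bias=0.0, domain_label=1.0
)
reversed_grad = grl.backward(grad)  # what the embedding network would receive
```

In a full system the speaker-classification gradient would be added to `reversed_grad` before updating the embedding network, while the domain classifier itself is updated with the unreversed gradient.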

