MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification

11/11/2021
by Ladislav Mošner, et al.

Motivated by the unconsolidated state of available data and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems. It can also be readily used for experiments with dereverberation, denoising, and speech enhancement. We tackle the ever-present lack of multi-channel training data by simulating data on top of clean parts of the VoxCeleb dataset. The development and evaluation trials are based on the retransmitted Voices Obscured in Complex Environmental Settings (VOiCES) corpus, which we modified to provide multi-channel trials. We publish full recipes that create the dataset from public sources as the MultiSV corpus, and we provide results for two of our multi-channel speaker verification systems with neural network-based beamforming, based either on predicting ideal binary masks or on the more recent Conv-TasNet.
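
The abstract mentions beamforming driven by predicted ideal binary masks but does not define them. As a rough illustration only, the sketch below uses the common textbook definition: a time-frequency bin is marked 1 when the local speech-to-noise ratio exceeds a threshold, and 0 otherwise. The function name `ideal_binary_mask`, the 0 dB default threshold, and the toy spectrograms are assumptions made for this example, not details taken from the MultiSV recipes.

```python
import numpy as np


def ideal_binary_mask(speech_spec, noise_spec, threshold_db=0.0):
    """Hypothetical helper: 1 where the local SNR exceeds threshold_db, else 0.

    Both inputs are spectrograms (frequency bins x frames) of the clean
    speech and the noise, which are only available for simulated data.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    speech_pow = np.abs(speech_spec) ** 2
    noise_pow = np.abs(noise_spec) ** 2
    local_snr_db = 10.0 * np.log10((speech_pow + eps) / (noise_pow + eps))
    return (local_snr_db > threshold_db).astype(np.float32)


# Toy magnitude spectrograms, 2 frequency bins x 2 frames (made-up values).
speech = np.array([[2.0, 0.1], [0.5, 3.0]])
noise = np.array([[1.0, 1.0], [1.0, 1.0]])
mask = ideal_binary_mask(speech, noise)
```

Because the mask needs the clean speech and noise separately, it can only be computed as a training target on simulated mixtures; at test time a network would have to predict it from the noisy input.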

research
12/08/2019

A Multi Purpose and Large Scale Speech Corpus in Persian and English for Speaker and Speech Recognition: the DeepMine Database

DeepMine is a speech database in Persian and English designed to build a...
research
12/03/2019

HI-MIA : A Far-field Text-Dependent Speaker Verification Database and the Baselines

This paper presents a large far-field text-dependent speaker verificatio...
research
02/10/2022

Royalflush Speaker Diarization System for ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge

This paper describes the Royalflush speaker diarization system submitted...
research
10/17/2022

How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Speaker verification (SV) suffers from unsatisfactory performance in far...
research
10/14/2021

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

Recent development of speech signal processing, such as speech recogniti...
research
09/14/2023

M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec

We introduce M3-AUDIODEC, an innovative neural spatial audio codec desig...
research
10/25/2019

Channel adversarial training for speaker verification and diarization

Previous work has encouraged domain-invariance in deep speaker embedding...
