Royalflush Speaker Diarization System for ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge

02/10/2022
by Jingguang Tian, et al.

This paper describes the Royalflush speaker diarization system submitted to the Multi-channel Multi-party Meeting Transcription Challenge (M2MeT). The system comprises speech enhancement, overlapped speech detection, speaker embedding extraction, speaker clustering, speech separation, and system fusion. We make three contributions. First, for far-field overlapped speech detection, we propose an architecture that combines multi-channel and U-Net-based models to exploit the benefits of both. Second, to let the overlapped speech detection model aid speaker diarization, we propose a speech-separation-based approach to handling overlapped speech, in which a speaker verification technique is further applied. Third, we explore three speaker embedding methods and obtain state-of-the-art performance on the CNCeleb-E test set. With these proposals, our best individual system significantly reduces the DER from 15.25, and the fusion of four systems finally achieves a DER of 6.30 on the Alimeeting evaluation set.
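For readers unfamiliar with the metric, the diarization error rate (DER) quoted above is conventionally the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech time. The following is a minimal frame-level sketch of that definition, not the challenge's scoring tool: the function name `frame_der` and the per-frame set representation are illustrative assumptions, hypothesis labels are assumed to be already mapped to reference labels, and no forgiveness collar is applied.

```python
def frame_der(reference, hypothesis):
    """Simplified frame-level DER.

    reference, hypothesis: lists of sets of speaker labels, one set per frame.
    Assumption: hypothesis labels are already mapped to reference labels
    (real scorers first search for the optimal speaker mapping).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must cover the same frames")

    missed = false_alarm = confusion = total = 0
    for ref, hyp in zip(reference, hypothesis):
        n_ref, n_hyp = len(ref), len(hyp)
        n_correct = len(ref & hyp)
        missed += max(0, n_ref - n_hyp)             # reference speakers left unassigned
        false_alarm += max(0, n_hyp - n_ref)        # extra hypothesized speakers
        confusion += min(n_ref, n_hyp) - n_correct  # assigned to the wrong speaker
        total += n_ref                              # overlapped frames count once per speaker

    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


# Toy example: frame 3 is overlapped speech, and the hypothesis finds only one speaker there.
reference = [{"A"}, {"A"}, {"A", "B"}, {"B"}, set()]
hypothesis = [{"A"}, {"B"}, {"A"}, {"B"}, {"B"}]
print(frame_der(reference, hypothesis))
```

In the toy example, the overlapped frame contributes a missed-speech error because only one of its two speakers is hypothesized, which illustrates why overlapped speech detection and handling can affect DER on meeting data.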
