Separation Guided Speaker Diarization in Realistic Mismatched Conditions

07/06/2021
by   Shu-Tong Niu, et al.

We propose a separation guided speaker diarization (SGSD) approach that fully exploits the complementarity of speech separation and speaker clustering. Because the conventional clustering-based speaker diarization (CSD) approach cannot handle overlapping speech segments well, we investigate separation-based speaker diarization (SSD), which inherently has the potential to handle speaker overlap regions. Our preliminary analysis shows that the state-of-the-art Conv-TasNet based speech separation, which works quite well on simulated data, is unstable on realistic conversational speech because of the strong mismatch in speaking style between the simulated training speech, built from read speech, and real conversations. We therefore use separation-based processing to assist CSD in handling overlapping speech segments under realistic mismatched conditions. Specifically, several strategies are designed to select between the results of the SSD and CSD systems based on an analysis of the instability of the SSD system's performance. Experiments on the conversational telephone speech (CTS) data from the DIHARD-III Challenge show that the proposed SGSD system significantly improves on state-of-the-art CSD systems, yielding relative diarization error rate reductions of 20.2 evaluation set, respectively.
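The abstract does not spell out the selection strategies, so the following is only a minimal sketch of what choosing between SSD and CSD outputs could look like. The stability check (falling back to CSD when one separated speaker receives almost no speech), the `min_share` threshold, and all function names are assumptions made for illustration, not the authors' method. The helper `relative_der_reduction` shows the standard arithmetic behind a relative diarization error rate reduction.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A diarization hypothesis: (start_sec, end_sec, speaker_label) per segment.
Segment = Tuple[float, float, str]


def speaker_durations(segments: List[Segment]) -> Dict[str, float]:
    """Total speech time attributed to each speaker."""
    totals: Dict[str, float] = defaultdict(float)
    for start, end, speaker in segments:
        totals[speaker] += max(0.0, end - start)
    return dict(totals)


def select_diarization(
    csd: List[Segment],
    ssd: List[Segment],
    min_share: float = 0.1,  # hypothetical threshold, not from the paper
) -> List[Segment]:
    """Return the SSD result unless it looks unstable, else the CSD result.

    Assumed instability heuristic: the separation output is distrusted
    when it yields fewer than two speakers, or when the least active
    speaker holds less than `min_share` of the total speech time.
    """
    durations = speaker_durations(ssd)
    total = sum(durations.values())
    if len(durations) < 2 or total <= 0.0:
        return csd
    if min(durations.values()) / total < min_share:
        return csd
    return ssd


def relative_der_reduction(baseline_der: float, new_der: float) -> float:
    """Relative DER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_der <= 0.0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - new_der) / baseline_der


if __name__ == "__main__":
    csd_result = [(0.0, 5.0, "A"), (5.0, 10.0, "B")]
    ssd_result = [(0.0, 6.0, "A"), (4.0, 10.0, "B")]  # overlap from 4 s to 6 s
    chosen = select_diarization(csd_result, ssd_result)
    print("chose SSD" if chosen is ssd_result else "chose CSD")
```

A per-recording switch like this keeps the overlap handling of the separation output where it appears to be working and falls back to clustering where it does not. Any real system would need a stability criterion validated on held-out conversational data.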

