TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization

03/08/2023
by Jiaming Wang, et al.

End-to-end neural diarization (EEND) was recently introduced and achieves promising results in speaker-overlapped scenarios. In EEND, speaker diarization is formulated as a multi-label prediction problem in which each speaker's activity is estimated independently, so dependencies between speakers are not well modeled. To overcome this limitation, we employ power set encoding to reformulate speaker diarization as a single-label classification problem and propose the overlap-aware EEND (EEND-OLA) model, in which speaker overlaps and dependencies are modeled explicitly. Inspired by the success of two-stage hybrid systems, we further propose a novel Two-stage OverLap-aware Diarization framework (TOLD), which adds a speaker overlap-aware post-processing (SOAP) model to iteratively refine the diarization results of EEND-OLA. Experimental results show that, compared with the original EEND, the proposed EEND-OLA achieves a 14.39% relative improvement in diarization error rate (DER), and utilizing SOAP provides another 19.33% relative improvement. As a result, our method TOLD achieves a DER of 10.14% on the CALLHOME dataset, which is, to the best of our knowledge, a new state-of-the-art result on this benchmark.
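
To illustrate the reformulation described in the abstract, the following is a minimal sketch of power set encoding: each frame's multi-hot speaker-activity vector (the multi-label view) is mapped to a single class index over all speaker subsets (the single-label view). The function names and the binary-weighted label ordering are assumptions made for illustration; the abstract does not specify the mapping used in EEND-OLA.

```python
from typing import List


def powerset_encode(activity: List[int]) -> int:
    """Map a multi-hot speaker-activity vector to one class index.

    Assumed ordering (hypothetical, not taken from the paper): speaker i
    contributes 2**i, so every subset of speakers gets a unique label.
    """
    return sum(bit << i for i, bit in enumerate(activity))


def powerset_decode(label: int, num_speakers: int) -> List[int]:
    """Recover the multi-hot activity vector from a class index."""
    return [(label >> i) & 1 for i in range(num_speakers)]


def encode_frames(frames: List[List[int]]) -> List[int]:
    """Encode a sequence of per-frame activity vectors as class labels."""
    return [powerset_encode(frame) for frame in frames]


if __name__ == "__main__":
    # Three speakers; rows are frames, columns are speakers.
    frames = [
        [0, 0, 0],  # silence
        [1, 0, 0],  # speaker 0 only
        [1, 0, 1],  # speakers 0 and 2 overlap
    ]
    labels = encode_frames(frames)
    print(labels)
    print([powerset_decode(label, 3) for label in labels])
```

With N speakers this yields 2^N classes, so an overlap such as "speakers 0 and 2 together" is a class in its own right and a single softmax over the classes can model it directly, rather than through N independent per-speaker decisions.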

research
11/18/2022

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

Recently, hybrid systems of clustering and neural diarization models hav...
research
07/25/2022

Unsupervised Speaker Diarization that is Agnostic to Language, Overlap-Aware, and Tuning Free

Podcasts are conversational in nature and speaker changes are frequent –...
research
04/08/2021

End-to-end speaker segmentation for overlap-aware resegmentation

Speaker segmentation consists in partitioning a conversation between one...
research
03/18/2022

Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios

Overlapping speech diarization has been traditionally treated as a multi...
research
10/25/2019

Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection

We address the problem of effectively handling overlapping speech in a d...
research
11/28/2021

Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information

Overlapping speech diarization is always treated as a multi-label classi...
research
09/02/2019

Identifying Personality Traits Using Overlap Dynamics in Multiparty Dialogue

Research on human spoken language has shown that speech plays an importa...
