DiaCorrect: End-to-end error correction for speaker diarization

10/31/2022
by Jiangyu Han, et al.

In recent years, speaker diarization has attracted widespread attention. To achieve better performance, some studies propose to diarize speech in multiple stages. Although these methods might bring additional benefits, most of them are quite complex. Motivated by spelling correction in automatic speech recognition (ASR), in this paper we propose an end-to-end error correction framework, termed DiaCorrect, to refine the initial diarization results in a simple but efficient way. By exploiting the acoustic interactions between the input mixture and its corresponding speaker activity, DiaCorrect can automatically adapt the initial speaker activity to minimize the diarization errors. Without bells and whistles, experiments on LibriSpeech-based 2-speaker meeting-like data show that the self-attentive end-to-end neural diarization (SA-EEND) baseline with DiaCorrect can reduce its diarization error rate (DER) by over 62.4%. Source code: https://github.com/jyhan03/diacorrect.
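To make the reported metric concrete, the following is a minimal sketch of a simplified frame-level DER and of the relative reduction between an initial and a corrected diarization output. It is not taken from the DiaCorrect repository: the function names `frame_der` and `relative_reduction` and the toy activity sequences are assumptions made for illustration, and the sketch ignores the forgiveness collar and time weighting that standard scoring tools apply.

```python
from itertools import permutations


def frame_der(ref, hyp):
    """Simplified frame-level DER (hypothetical helper, not the paper's scorer).

    ref, hyp: equal-length lists of frames; each frame is a tuple of 0/1
    activity flags, one per speaker. Speaker order in hyp is arbitrary, so
    every permutation is tried and the lowest error count is kept.
    """
    n_spk = len(ref[0])
    total_ref = sum(sum(frame) for frame in ref)
    if total_ref == 0:
        raise ValueError("reference contains no speech")
    best = None
    for perm in permutations(range(n_spk)):
        errors = 0
        for r, h in zip(ref, hyp):
            h = [h[p] for p in perm]
            n_ref, n_hyp = sum(r), sum(h)
            n_ok = sum(a & b for a, b in zip(r, h))   # correctly assigned speakers
            miss = max(n_ref - n_hyp, 0)              # missed speech
            false_alarm = max(n_hyp - n_ref, 0)       # false-alarm speech
            confusion = min(n_ref, n_hyp) - n_ok      # speaker confusion
            errors += miss + false_alarm + confusion
        best = errors if best is None else min(best, errors)
    return best / total_ref


def relative_reduction(before, after):
    """Relative DER reduction, e.g. 0.75 means a 75% relative reduction."""
    return (before - after) / before


# Toy 2-speaker example with made-up activities (5 frames).
reference = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
initial = [(1, 0), (0, 0), (1, 0), (1, 0), (1, 0)]    # stand-in for a first-pass output
corrected = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 1)]  # stand-in for a refined output

der_initial = frame_der(reference, initial)
der_corrected = frame_der(reference, corrected)
print(der_initial, der_corrected, relative_reduction(der_initial, der_corrected))
```

In this toy case the initial output makes four frame-level errors against five reference speaker-frames and the corrected output makes one, so the relative reduction is computed the same way as the percentage quoted in the abstract, only on invented data.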

