Improving End-to-End Neural Diarization Using Conversational Summary Representations

06/24/2023
by Samuel J. Broughton, et al.
Speaker diarization is the task of partitioning an audio recording by speaker identity. End-to-end neural diarization with encoder-decoder based attractor calculation (EEND-EDA) aims to solve this problem by directly outputting diarization results for a flexible number of speakers. Currently, the EDA module responsible for generating speaker-wise attractors is conditioned on zero vectors, which provide no relevant information to the network. In this work, we extend EEND-EDA by replacing the zero vectors input to the decoder with learned conversational summary representations. The updated EDA module sequentially generates speaker-wise attractors based on utterance-level information. We propose three methods to initialize the summary vector and investigate the effect of varying input recording lengths. On a range of publicly available test sets, our model achieves an absolute DER performance improvement of 1.90
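
The difference between conditioning the attractor decoder on zero vectors and conditioning it on a summary vector can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a plain tanh recurrent cell stands in for the LSTM encoder-decoder, mean pooling over frame embeddings stands in for the learned summary (the abstract does not describe the three proposed initializations), the weights are random rather than trained, and the number of speakers is fixed instead of being decided by an attractor existence probability. The function names are hypothetical and do not come from the paper.

```python
import math
import random

def mean_pool(frames):
    """Average T frame embeddings (each of length D) into one D-dim vector.
    Assumed stand-in for a learned conversational summary."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def rnn_step(x, h, w_in, w_rec):
    """One tanh recurrent step: h' = tanh(W_in x + W_rec h)."""
    dim = len(h)
    return [
        math.tanh(
            sum(w_in[i][j] * x[j] for j in range(dim))
            + sum(w_rec[i][j] * h[j] for j in range(dim))
        )
        for i in range(dim)
    ]

def encode(frames, w_in, w_rec):
    """Run the recurrent cell over all frames and return the final state."""
    h = [0.0] * len(frames[0])
    for frame in frames:
        h = rnn_step(frame, h, w_in, w_rec)
    return h

def decode_attractors(h0, step_input, num_speakers, w_in, w_rec):
    """Emit one attractor per step, feeding the same input vector each time.
    A fixed num_speakers replaces the usual stopping criterion (simplification)."""
    h, attractors = h0, []
    for _ in range(num_speakers):
        h = rnn_step(step_input, h, w_in, w_rec)
        attractors.append(h)
    return attractors

# Toy data: T=6 frames of D=4 dimensional embeddings, untrained random weights.
random.seed(0)
D, T = 4, 6
frames = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
w_in = [[random.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(D)]
w_rec = [[random.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(D)]

h0 = encode(frames, w_in, w_rec)

# Baseline-style conditioning: the decoder input is all zeros at every step.
zero_conditioned = decode_attractors(h0, [0.0] * D, 2, w_in, w_rec)

# Summary-style conditioning: the decoder input is an utterance-level vector.
summary = mean_pool(frames)
summary_conditioned = decode_attractors(h0, summary, 2, w_in, w_rec)
```

With zero inputs, the input weights contribute nothing and each attractor depends only on the previous decoder state. With a summary input, every decoding step also sees utterance-level information, which is the change the abstract describes.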
