Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting

09/24/2022
by Jie Wang, et al.

This paper describes a spatial-aware speaker diarization system for multi-channel multi-party meetings. The diarization system obtains speaker direction information from a microphone array. A speaker-spatial embedding is generated from an x-vector and an s-vector derived from superdirective beamforming (SDB), which makes the embedding more robust. Specifically, we propose a novel multi-channel sequence-to-sequence neural network architecture named discriminative multi-stream neural network (DMSNet), which consists of an attention superdirective beamforming (ASDB) block and a Conformer encoder. The proposed ASDB is a self-adapted channel-wise block that extracts the latent spatial features of array audio by modeling interdependencies between channels. We explore DMSNet to address the overlapped speech problem on multi-channel audio and achieve 93.53 overlapped speech detection (OSD) module, the diarization error rate (DER) of cluster-based diarization system decrease significantly from 13.45
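
To illustrate why overlapped speech matters for the DER mentioned in the abstract, here is a minimal sketch of a frame-level DER computation in Python. It is not the paper's scoring setup: the function name `frame_der`, the toy labels, and the simplifications (speaker labels already mapped between reference and hypothesis, no forgiveness collar, fixed-length frames) are all assumptions made for illustration. The per-frame error count follows the usual definition, max(N_ref, N_hyp) minus the number of correctly attributed speakers, normalized by total reference speaker time.

```python
def frame_der(reference, hypothesis):
    """Frame-level diarization error rate.

    Each argument is a list with one set of speaker labels per frame.
    Assumes (for illustration) that hypothesis labels are already mapped
    to reference labels and that no collar is applied.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    total_ref = 0  # total reference speaker-frames (the denominator)
    errors = 0     # missed speech + false alarm + speaker confusion
    for ref, hyp in zip(reference, hypothesis):
        n_correct = len(ref & hyp)
        errors += max(len(ref), len(hyp)) - n_correct
        total_ref += len(ref)

    if total_ref == 0:
        raise ValueError("reference contains no speech")
    return errors / total_ref


# Toy example: frame 2 contains overlapped speech from speakers A and B.
reference = [{"A"}, {"A", "B"}, {"B"}, set()]

# A cluster-based system that outputs at most one speaker per frame
# necessarily misses one speaker in the overlapped frame.
single_speaker_hyp = [{"A"}, {"A"}, {"B"}, set()]

# The same output after an overlap detector has assigned a second speaker.
overlap_aware_hyp = [{"A"}, {"A", "B"}, {"B"}, set()]

der_single = frame_der(reference, single_speaker_hyp)
der_overlap = frame_der(reference, overlap_aware_hyp)
```

In this toy case the single-speaker output is charged one missed speaker-frame out of four, while the overlap-aware output is charged none, which is the kind of gap an OSD module is meant to close.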

research
02/11/2022

The xmuspeech system for multi-channel multi-party meeting transcription challenge

This paper describes the system developed by the XMUSPEECH team for the ...
research
09/17/2023

Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture

We propose a novel neural speaker diarization system using memory-aware ...
research
02/09/2022

The Volcspeech system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge

This paper describes our submission to ICASSP 2022 Multi-channel Multi-p...
research
09/14/2023

M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec

We introduce M3-AUDIODEC, an innovative neural spatial audio codec desig...
research
03/24/2021

Blind Speech Separation and Dereverberation using Neural Beamforming

In this paper, we present the Blind Speech Separation and Dereverberatio...
research
10/11/2022

MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario

Recently cross-channel attention, which better leverages multi-channel s...
research
10/26/2022

Speaker Diarization Based on Multi-channel Microphone Array in Small-scale Meeting

In the task of speaker diarization, the number of small-scale meetings a...
