A study of semi-supervised speaker diarization system using GAN mixture model

10/24/2019
by Monisankha Pal, et al.

We propose a new speaker diarization system based on a recently introduced unsupervised clustering technique, the generative adversarial network mixture model (GANMM). The proposed system uses x-vectors as its front-end representation. Spectral embedding is used for dimensionality reduction, followed by k-means initialization during GANMM pre-training. GANMM performs unsupervised speaker clustering by efficiently capturing complex data distributions. Experimental results on the AMI meeting corpus show that the proposed semi-supervised diarization system matches or exceeds the performance of competitive baselines. On an evaluation set containing fifty sessions of varying duration, the best average diarization error rate (DER) achieved is 17.11%, comparable to the x-vector baseline.
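The front-end steps named in the abstract (spectral embedding for dimensionality reduction, then k-means initialization) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: synthetic vectors stand in for x-vectors, the cosine affinity and farthest-first seeding are choices made here for simplicity, and all function names are hypothetical. GANMM training itself is not shown; the k-means labels would only serve as its starting point.

```python
import numpy as np


def demo_segments(n_per_speaker=20, dim=16, seed=0):
    """Synthetic stand-in for x-vectors: two 'speakers' with distinct mean directions."""
    rng = np.random.default_rng(seed)
    means = np.zeros((2, dim))
    means[0, 0] = 5.0
    means[1, 1] = 5.0
    return np.vstack(
        [m + 0.3 * rng.standard_normal((n_per_speaker, dim)) for m in means]
    )


def spectral_embedding(x, dim):
    """Embed the rows of x into `dim` dimensions via a normalized graph Laplacian."""
    # Cosine affinity between length-normalized vectors, shifted into [0, 1].
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    affinity = (x @ x.T + 1.0) / 2.0
    np.fill_diagonal(affinity, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(x)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the `dim` smallest.
    _, vecs = np.linalg.eigh(laplacian)
    return vecs[:, :dim]


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-first seeding (an assumption)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


segments = demo_segments()
embedded = spectral_embedding(segments, dim=2)
initial_labels = kmeans(embedded, k=2)  # would seed the mixture model's pre-training
```

In this sketch the number of speakers is passed in as `k`; how the speaker count is obtained in the actual system is not stated in the abstract.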

research
07/14/2018

Adversarially Learned Mixture Model

The Adversarially Learned Mixture Model (AMM) is a generative model for ...
research
06/18/2022

Semi-supervised Time Domain Target Speaker Extraction with Attention

In this work, we propose Exformer, a time-domain architecture for target...
research
08/18/2018

Robust Speaker Clustering using Mixtures of von Mises-Fisher Distributions for Naturalistic Audio Streams

Speaker Diarization (i.e. determining who spoke and when?) for multi-spe...
research
02/14/2022

Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model

Speaker diarization has been investigated extensively as an important ce...
research
04/18/2022

Robust End-to-end Speaker Diarization with Generic Neural Clustering

End-to-end speaker diarization approaches have shown exceptional perform...
research
10/24/2019

Speaker diarization using latent space clustering in generative adversarial network

In this work, we propose deep latent space clustering for speaker diariz...
research
07/19/2020

Meta-learning with Latent Space Clustering in Generative Adversarial Network for Speaker Diarization

The performance of most speaker diarization systems with x-vector embedd...
