Robust Speaker Clustering using Mixtures of von Mises-Fisher Distributions for Naturalistic Audio Streams

08/18/2018
by Harishchandra Dubey, et al.

Speaker diarization (i.e., determining who spoke and when?) for multi-speaker naturalistic interactions such as Peer-Led Team Learning (PLTL) sessions is a challenging task. In this study, we propose robust speaker clustering based on a mixture of multivariate von Mises-Fisher distributions. Our diarization pipeline has two stages: (i) ground-truth segmentation; (ii) proposed speaker clustering. The ground-truth speech activity information is used for extracting i-Vectors from each speech segment. We post-process the i-Vectors with principal component analysis for dimension reduction, followed by length normalization. Normalized i-Vectors are high-dimensional unit vectors possessing discriminative directional characteristics. We model the normalized i-Vectors with a mixture model consisting of multivariate von Mises-Fisher distributions. K-means clustering with cosine distance is chosen as the baseline approach. The evaluation data is derived from: (i) the CRSS-PLTL corpus; and (ii) a three-meeting subset of the AMI corpus. The CRSS-PLTL data contain audio recordings of PLTL sessions, a student-led STEM education paradigm. The proposed approach is consistently better than the baseline, leading to up to 44.48 improvements for PLTL and AMI corpus, respectively.

Index Terms: Speaker clustering, von Mises-Fisher distribution, Peer-led team learning, i-Vector, Naturalistic Audio.
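
The post-processing and clustering steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a simplified mixture in which every von Mises-Fisher component shares one fixed concentration `kappa` (a hypothetical parameter chosen here for illustration), so the normalizing constants cancel in the E-step; a full mixture of von Mises-Fisher distributions would typically also estimate a concentration per component. The function names, the farthest-point initialization, and the random stand-in data are likewise assumptions of this sketch.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Reduce dimensionality with PCA computed via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def length_normalize(X):
    """Scale each row to unit Euclidean norm (points on the unit hypersphere)."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def init_directions(X, n_clusters):
    """Farthest-point initialization (assumption): start from row 0, then
    repeatedly pick the row least similar (cosine) to the chosen ones."""
    chosen = [0]
    for _ in range(1, n_clusters):
        sim = (X @ X[chosen].T).max(axis=1)
        chosen.append(int(sim.argmin()))
    return X[chosen].copy()

def fit_movmf(X, n_clusters, kappa=20.0, n_iter=50):
    """EM for a vMF mixture with a shared, fixed concentration kappa.

    X must contain unit-norm rows. Returns hard labels, mean directions,
    and mixing weights.
    """
    mu = init_directions(X, n_clusters)
    alpha = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iter):
        # E-step: log posterior up to a constant; with shared kappa the
        # vMF normalizer is identical for all components and cancels.
        logp = np.log(np.maximum(alpha, 1e-12)) + kappa * (X @ mu.T)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mixing weights and mean directions (normalized weighted sums).
        alpha = resp.mean(axis=0)
        r = resp.T @ X
        mu = r / np.maximum(np.linalg.norm(r, axis=1, keepdims=True), 1e-12)
    return resp.argmax(axis=1), mu, alpha

# Usage with random stand-in vectors (real i-Vectors would come from a
# separate extractor, which is outside the scope of this sketch).
rng = np.random.default_rng(0)
ivectors = rng.standard_normal((200, 100))
unit_vectors = length_normalize(pca_reduce(ivectors, 20))
labels, mu, alpha = fit_movmf(unit_vectors, n_clusters=4)
```

Letting `kappa` grow large makes the soft assignments approach hard assignments by cosine similarity, which is how this simplified model relates to the cosine-distance k-means baseline mentioned in the abstract.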

research
07/12/2019

Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams

Speaker diarization determines who spoke and when? in an audio stream. I...
research
10/24/2019

A study of semi-supervised speaker diarization system using gan mixture model

We propose a new speaker diarization system based on a recently introduc...
research
09/10/2020

Speaker Diarization Using Stereo Audio Channels: Preliminary Study on Utterance Clustering

Speaker diarization is one of the actively researched topics in audio si...
research
06/12/2013

Robust Support Vector Machines for Speaker Verification Task

An important step in speaker verification is extracting features that be...
research
09/29/2017

PLDA-Based Diarization of Telephone Conversations

This paper investigates the application of the probabilistic linear disc...
research
08/02/2018

Statistical Speech Model Description with VMF Mixture Model

In this paper, we present the LSF parameters by a unit vector form, whic...
research
12/16/2015

A Novel Minimum Divergence Approach to Robust Speaker Identification

In this work, a novel solution to the speaker identification problem is ...
