Diarisation using location tracking with agglomerative clustering

09/22/2021
by Jeremy H. M. Wong, et al.

Previous work has shown that spatial location information can complement speaker embeddings in speaker diarisation. However, the models used often assume that speakers remain fairly stationary throughout a meeting. This paper proposes relaxing this assumption by explicitly modelling the movements of speakers within an Agglomerative Hierarchical Clustering (AHC) diarisation framework. Kalman filters, which track the locations of speakers, are used to compute log-likelihood ratios that contribute to the cluster affinity computations for the AHC merging and stopping decisions. Experiments show that the proposed approach yields improvements on a Microsoft rich meeting transcription task, compared to methods that do not use location information or that make stationarity assumptions.
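
The abstract does not give the state model or the exact form of the log-likelihood ratio, so the following is only a minimal sketch of the general idea, under stated assumptions: a scalar location (for example a direction-of-arrival angle), a random-walk state model, and hypothetical noise parameters `q` and `r`. The function names `kalman_log_likelihood` and `merge_llr` are illustrative and do not come from the paper. The ratio compares the hypothesis that two clusters' location observations come from one moving speaker against the hypothesis that they come from two independent speakers.

```python
import math


def kalman_log_likelihood(obs, q=0.01, r=0.05, prior_mean=0.0, prior_var=10.0):
    """Log-likelihood of (time, location) observations under a scalar
    random-walk Kalman filter.

    Assumed model (not taken from the paper):
      state:        x_t = x_{t-1} + w,  w ~ N(0, q * dt)
      observation:  z_t = x_t + v,      v ~ N(0, r)
    `obs` must be sorted by time.
    """
    mean, var = prior_mean, prior_var
    prev_t = None
    total = 0.0
    for t, z in obs:
        if prev_t is not None:
            # Predict: the speaker may have moved, so uncertainty grows with dt.
            var += q * (t - prev_t)
        innovation = z - mean
        innovation_var = var + r
        # Accumulate the Gaussian log-density of the innovation.
        total += -0.5 * (math.log(2.0 * math.pi * innovation_var)
                         + innovation ** 2 / innovation_var)
        # Update the state estimate with the new observation.
        gain = var / innovation_var
        mean += gain * innovation
        var *= 1.0 - gain
        prev_t = t
    return total


def merge_llr(track_a, track_b, **params):
    """Log-likelihood ratio: one shared track versus two independent tracks."""
    merged = sorted(track_a + track_b)
    return (kalman_log_likelihood(merged, **params)
            - kalman_log_likelihood(track_a, **params)
            - kalman_log_likelihood(track_b, **params))


# Hypothetical clusters: each is a list of (time in seconds, location) pairs.
drifting_a = [(0.0, 1.00), (2.0, 1.05), (4.0, 1.10)]
drifting_b = [(1.0, 1.02), (3.0, 1.08)]   # consistent with the same slow drift
elsewhere = [(1.0, -2.00), (3.0, -2.10)]  # a different place in the room

print(merge_llr(drifting_a, drifting_b))  # positive: merging is favoured
print(merge_llr(drifting_a, elsewhere))   # negative: merging is penalised
```

In an AHC loop, a score of this kind could be added to an embedding-based affinity for each candidate cluster pair, with merging stopped once the best combined score falls below a threshold. How the paper weights and combines the two terms is not stated in the abstract.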

research
09/23/2021

Joint speaker diarisation and tracking in switching state-space model

Speakers may move around while diarisation is being performed. When a mi...
research
10/28/2017

Jointly Tracking and Separating Speech Sources Using Multiple Features and the generalized labeled multi-Bernoulli Framework

This paper proposes a novel joint multi-speaker tracking-and-separation ...
research
10/08/2021

Location-based training for multi-channel talker-independent speaker separation

Permutation-invariant training (PIT) is a dominant approach for addressi...
research
08/30/2019

Enhancements for Audio-only Diarization Systems

In this paper two different approaches to enhance the performance of the...
research
08/25/2018

Multiobjective Optimization Training of PLDA for Speaker Verification

Most current state-of-the-art text-independent speaker verification syst...
research
05/15/2020

Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification

Identifying multiple speakers without knowing where a speaker's voice is...
research
02/08/2016

The "Sprekend Nederland" project and its application to accent location

This paper describes the data collection effort that is part of the proj...