DNN Speaker Tracking with Embeddings

In multi-speaker applications, it is common to have pre-computed models of enrolled speakers. Speaker tracking is the task of using these models to identify the regions of a recording in which those speakers talk. In this paper, we propose a novel embedding-based speaker tracking method. Specifically, our design is based on a convolutional neural network that mimics a typical speaker verification PLDA (probabilistic linear discriminant analysis) classifier and finds the regions uttered by the target speakers in an online fashion. The system was studied from two perspectives, diarization and tracking; results on both show a significant improvement over the PLDA baseline under the same experimental conditions. Two standard public datasets, CALLHOME and DIHARD II single channel, were modified to create two-speaker subsets with overlapping and non-overlapping regions. We evaluate the robustness of our supervised approach with models generated from different segment lengths. A relative improvement of 17% on DIHARD II single channel shows promising performance. Furthermore, to make the baseline system similar to speaker tracking, non-target speakers were added to the recordings. Even in these adverse conditions, our approach is robust enough to outperform the PLDA baseline.
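To make the tracking task concrete, the following is a minimal sketch of online, embedding-based tracking against enrolled speaker models. It is not the method proposed in the paper: plain cosine similarity stands in for the CNN or PLDA scorer, and the function names, the `threshold` value, and the toy two-dimensional embeddings are all assumptions made for illustration. It also assigns at most one speaker per segment, so it does not handle overlapping speech.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def track_speakers(
    segments: List[Vector],
    enrolled: Dict[str, Vector],
    threshold: float = 0.7,  # hypothetical decision threshold
) -> List[Optional[str]]:
    """Label each segment with the best-matching enrolled speaker, or None.

    Segments are processed one at a time with no look-ahead, which is what
    makes the procedure online. A segment whose best score falls below the
    threshold is treated as a non-target speaker (None).
    """
    labels: List[Optional[str]] = []
    for embedding in segments:
        best_name: Optional[str] = None
        best_score = threshold
        for name, model in enrolled.items():
            score = cosine(embedding, model)
            if score >= best_score:
                best_name, best_score = name, score
        labels.append(best_name)
    return labels


# Toy example: two enrolled speakers and three segment embeddings.
enrolled_models = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
segment_embeddings = [[0.9, 0.1], [0.1, 0.95], [-1.0, -1.0]]
labels = track_speakers(segment_embeddings, enrolled_models)
```

In this toy run the first two segments match the enrolled speakers and the third is rejected as a non-target. A real system would replace `cosine` with a trained scorer and the toy vectors with embeddings extracted from audio.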


