Probabilistic Back-ends for Online Speaker Recognition and Clustering

02/19/2023
by Alexey Sholokhov, et al.

This paper focuses on multi-enrollment speaker recognition, which arises naturally in online speaker clustering, and studies the properties of different scoring back-ends in this scenario. First, we show that the popular cosine scoring suffers from poor score calibration when the number of enrollment utterances varies. Second, we propose a simple replacement for cosine scoring based on an extremely constrained version of probabilistic linear discriminant analysis (PLDA). The proposed model improves over cosine scoring for multi-enrollment recognition while matching its performance in one-to-one comparisons. Finally, we consider an online speaker clustering task, where each step naturally involves multi-enrollment recognition. We propose an online clustering algorithm that exploits the benefits of the PLDA model, such as the ability to handle uncertainty and better score calibration. Our experiments demonstrate the effectiveness of the proposed algorithm.
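To make the contrast between the two back-ends concrete, here is a minimal sketch of multi-enrollment scoring. It is not the paper's implementation. The abstract does not specify the constrained PLDA model, so the sketch assumes one possible reading: a two-covariance model with isotropic between-speaker and within-speaker covariances. The function names and the parameters `between_var` and `within_var` are hypothetical, as is the choice to average enrollment embeddings for cosine scoring.

```python
import numpy as np


def cosine_score(enroll, test):
    """Cosine similarity between the averaged enrollment embedding and the test embedding.

    enroll: array of shape (n, d); test: array of shape (d,).
    Averaging the enrollment embeddings is an assumption of this sketch.
    """
    mean = enroll.mean(axis=0)
    return float(mean @ test / (np.linalg.norm(mean) * np.linalg.norm(test)))


def isotropic_plda_llr(enroll, test, between_var=1.0, within_var=1.0):
    """Log-likelihood ratio under an assumed isotropic two-covariance model.

    Assumed model (not taken from the source):
        speaker mean y ~ N(0, between_var * I)
        embedding  x | y ~ N(y, within_var * I)
    """
    n, d = enroll.shape
    # Posterior over the speaker mean given n enrollment embeddings.
    denom = within_var + n * between_var
    post_var = between_var * within_var / denom
    post_mean = (n * between_var / denom) * enroll.mean(axis=0)

    # Predictive variance if the test embedding comes from the same speaker,
    # and marginal variance if it comes from a different speaker.
    same_var = post_var + within_var
    diff_var = between_var + within_var

    log_same = -0.5 * (d * np.log(2 * np.pi * same_var)
                       + np.sum((test - post_mean) ** 2) / same_var)
    log_diff = -0.5 * (d * np.log(2 * np.pi * diff_var)
                       + np.sum(test ** 2) / diff_var)
    return float(log_same - log_diff)
```

In this sketch the number of enrollment utterances `n` enters the PLDA score explicitly through the posterior mean and variance, whereas the cosine score sees only the direction of the averaged embedding. That difference is one plausible way to read the abstract's claim about calibration under a varying number of enrollment utterances.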


research
10/27/2022

Toroidal Probabilistic Spherical Discriminant Analysis

In speaker recognition, where speech segments are mapped to embeddings o...
research
03/28/2022

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

In speaker recognition, where speech segments are mapped to embeddings o...
research
04/22/2022

Unifying Cosine and PLDA Back-ends for Speaker Verification

State-of-art speaker verification (SV) systems use a back-end model to s...
research
09/05/2021

The Phonexia VoxCeleb Speaker Recognition Challenge 2021 System Description

We describe the Phonexia submission for the VoxCeleb Speaker Recognition...
research
04/04/2021

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances

A back-end model is a key element of modern speaker verification systems...
research
04/05/2021

Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition

Many neural network speaker recognition systems model each speaker using...
research
02/23/2023

Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification

Speech utterances recorded under differing conditions exhibit varying de...
