Magnitude-aware Probabilistic Speaker Embeddings

02/28/2022
by   Nikita Kuzmin, et al.

Recently, hyperspherical embeddings have established themselves as a dominant technique for face and voice recognition. Specifically, Euclidean space vector embeddings are learned to encode person-specific information in their direction while ignoring the magnitude. However, recent studies have shown that the magnitudes of the embeddings extracted by deep neural networks may indicate the quality of the corresponding inputs. This paper explores the properties of the magnitudes of the embeddings related to quality assessment and out-of-distribution detection. We propose a new probabilistic speaker embedding extractor using the information encoded in the embedding magnitude and leverage it in the speaker verification pipeline. We also propose several quality-aware diarization methods and incorporate the magnitudes in those. Our results indicate significant improvements over magnitude-agnostic baselines both in speaker verification and diarization tasks.
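
To make the direction/magnitude distinction concrete, here is a minimal sketch in plain Python. It is not the extractor or scoring method proposed in the paper. It only illustrates the general idea under a simple assumption: the embedding norm is treated as a rough quality weight when pooling several segment embeddings. All function names (`magnitude`, `direction`, `cosine_score`, `magnitude_weighted_direction`) are hypothetical and chosen for this example.

```python
import math


def magnitude(v):
    """Euclidean norm of an embedding (assumed here to act as a quality proxy)."""
    return math.sqrt(sum(x * x for x in v))


def direction(v):
    """Project an embedding onto the unit hypersphere, discarding its magnitude."""
    n = magnitude(v)
    if n == 0.0:
        raise ValueError("zero-norm embedding has no direction")
    return [x / n for x in v]


def cosine_score(a, b):
    """Magnitude-agnostic similarity: compares directions only."""
    return sum(x * y for x, y in zip(direction(a), direction(b)))


def magnitude_weighted_direction(embeddings):
    """Pool segment embeddings, weighting each direction by its norm.

    Illustrative assumption: low-norm (possibly low-quality) segments
    contribute less to the pooled speaker representation.
    """
    dim = len(embeddings[0])
    pooled = [0.0] * dim
    for e in embeddings:
        w = magnitude(e)
        d = direction(e)
        for i in range(dim):
            pooled[i] += w * d[i]
    return direction(pooled)


# Two embeddings with the same direction but different magnitudes
# are indistinguishable to a cosine scorer.
same_direction = cosine_score([3.0, 4.0], [6.0, 8.0])

# A high-norm segment dominates the pooled direction.
pooled = magnitude_weighted_direction([[10.0, 0.0], [0.0, 1.0]])
```

In this toy example the cosine score ignores the factor-of-two difference in norm, while the pooled direction leans toward the high-norm segment. Whether the norm is a reliable quality signal depends on how the extractor was trained, which is the question the paper investigates.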


research
09/13/2019

Probing the Information Encoded in x-vectors

Deep neural network based speaker embeddings, such as x-vectors, have be...
research
10/28/2017

Speaker Diarization with LSTM

For many years, i-vector based speaker embedding techniques were the dom...
research
11/05/2018

How to Improve Your Speaker Embeddings Extractor in Generic Toolkits

Recently, speaker embeddings extracted with deep neural networks became ...
research
03/20/2020

Improving Embedding Extraction for Speaker Verification with Ladder Network

Speaker verification is an established yet challenging task in speech pr...
research
01/25/2023

Probing Taxonomic and Thematic Embeddings for Taxonomic Information

Modelling taxonomic and thematic relatedness is important for building A...
research
07/10/2017

Improving speaker turn embedding by crossmodal transfer learning from face embedding

Learning speaker turn embeddings has shown considerable improvement in s...
research
08/09/2017

Speaker Diarization using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings

In this paper we propose a new method of speaker diarization that employ...
