Probing the Information Encoded in x-vectors

09/13/2019
by Desh Raj, et al.

Deep neural network based speaker embeddings, such as x-vectors, have been shown to perform well in text-independent speaker recognition/verification tasks. In this paper, we use simple classifiers to investigate the contents encoded by x-vector embeddings. We probe these embeddings for information related to the speaker, channel, transcription (sentence, words, phones), and meta information about the utterance (duration and augmentation type), and compare these with the information encoded by i-vectors across a varying number of dimensions. We also study the effect of data augmentation during extractor training on the information captured by x-vectors. Experiments on the RedDots data set show that x-vectors capture spoken content and channel-related information, while performing well on speaker verification tasks.
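The probing idea described above can be sketched in a few lines: train a simple classifier on fixed embeddings to predict some property, then compare its held-out accuracy with chance. The sketch below is only an illustration under stated assumptions. It uses synthetic Gaussian vectors as a stand-in for x-vectors and a nearest-centroid classifier as the probe; neither is taken from the paper, and all function names and parameters (`make_synthetic_embeddings`, `signal`, and so on) are hypothetical.

```python
import random


def make_synthetic_embeddings(n_per_class, dim, n_classes, signal, seed=0):
    """Hypothetical stand-in for x-vectors: unit Gaussian noise plus a
    class-dependent offset whose size is controlled by `signal`."""
    rng = random.Random(seed)
    centers = [[rng.gauss(0, 1) * signal for _ in range(dim)]
               for _ in range(n_classes)]
    data = []
    for label, center in enumerate(centers):
        for _ in range(n_per_class):
            data.append(([c + rng.gauss(0, 1) for c in center], label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Compute the mean embedding of each class in the training split."""
    sums, counts = {}, {}
    for vec, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], vec)),
    )


def probe_accuracy(data, train_frac=0.8):
    """Fit the probe on the first part of the data and score it on the rest."""
    split = int(len(data) * train_frac)
    train, test = data[:split], data[split:]
    centroids = fit_centroids(train)
    correct = sum(predict(centroids, vec) == label for vec, label in test)
    return correct / len(test)


# Embeddings that encode the property: the probe should score well above chance.
informative = probe_accuracy(make_synthetic_embeddings(100, 16, 4, signal=2.0))
# Embeddings that do not encode it: the probe should stay near chance (1/4 here).
uninformative = probe_accuracy(make_synthetic_embeddings(100, 16, 4, signal=0.0))

print(f"probe accuracy, property encoded:     {informative:.2f}")
print(f"probe accuracy, property not encoded: {uninformative:.2f}")
```

A probe that scores well above chance suggests the property is recoverable from the embedding; a probe near chance suggests that this particular classifier cannot recover it, which is weaker than showing the information is absent.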

research
08/11/2020

S-vectors: Speaker Embeddings based on Transformer's Encoder for Text-Independent Speaker Verification

X-vectors have become the standard for speaker-embeddings in automatic s...
research
02/28/2022

Magnitude-aware Probabilistic Speaker Embeddings

Recently, hyperspherical embeddings have established themselves as a dom...
research
03/09/2022

An Environmental Feature Representation in I-vector Space for Room Verification and Metadata Estimation

This paper investigates the application of environmental feature represe...
research
07/07/2020

X-vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection from Speech

Many articles have used voice analysis to detect Parkinson's disease (PD...
research
02/10/2020

An empirical analysis of information encoded in disentangled neural speaker representations

The primary characteristic of robust speaker representations is that the...
research
04/07/2022

Detecting Vocal Fatigue with Neural Embeddings

Vocal fatigue refers to the feeling of tiredness and weakness of voice d...
research
04/17/2019

RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification

Recently, direct modeling of raw waveforms using deep neural networks ha...
