X-vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection from Speech

07/07/2020
by Laetitia Jeancolas, et al.

Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease or on the effect of gender. In this article, we adapted the latest speaker recognition system, called x-vectors, to detect early-stage PD from voice analysis. X-vectors are embeddings extracted from a deep neural network; they provide robust speaker representations and improve speaker recognition when large amounts of training data are used. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard MFCC-GMM classifier (Mel-Frequency Cepstral Coefficients - Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (including recently diagnosed PD subjects and healthy controls) with a high-quality microphone and with their own telephones. Men and women were analyzed separately in order to obtain more precise models and to assess a possible gender effect. We tested several experimental and methodological aspects to analyze their impact on classification performance: audio segment duration, data augmentation, the type of dataset used to train the neural network, the kind of speech task, and the back-end analysis. The x-vector technique provided better classification performance than MFCC-GMM for text-independent tasks, and seemed to be particularly suited to the early detection of PD in women (7 to 15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
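
As a rough illustration of what a back-end analysis on embeddings can look like, the sketch below scores a test embedding by its cosine similarity to the average PD embedding minus its cosine similarity to the average healthy-control embedding. This is only a minimal sketch under stated assumptions: the abstract does not say which back-ends were used, the function names (cosine, mean_vector, classify) are hypothetical, and the 3-dimensional vectors are made-up toy values, not real x-vectors or study data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(embedding, pd_train, hc_train):
    """Return (label, score) for one test embedding.

    score > 0 means the embedding is closer (in cosine terms) to the
    average PD embedding than to the average healthy-control one.
    The zero threshold is an illustrative assumption.
    """
    pd_mean = mean_vector(pd_train)
    hc_mean = mean_vector(hc_train)
    score = cosine(embedding, pd_mean) - cosine(embedding, hc_mean)
    return ("PD" if score > 0 else "HC"), score


# Toy embeddings, invented for illustration only.
pd_train = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
hc_train = [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]

label, score = classify([0.7, 0.3, 0.0], pd_train, hc_train)
print(label, round(score, 3))
```

In practice the embeddings would come from a trained network and the decision threshold would be tuned on held-out data, separately for each gender and recording type if the models are trained separately as described above.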

