Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

01/24/2022
by Xurong Xie, et al.

Dysarthric speech recognition is a challenging task because of acoustic variability and the limited amount of available data. The diverse conditions of dysarthric speakers account for this acoustic variability and make it difficult to model precisely. This paper presents a variational auto-encoder based variability encoder (VAEVE) that explicitly encodes such variability for dysarthric speech. The VAEVE uses both phoneme information and a low-dimensional latent variable to reconstruct the input acoustic features, so that the latent variable is forced to encode the phoneme-independent variability. The stochastic gradient variational Bayes algorithm is applied to model the distribution for generating variability encodings, which are then used as auxiliary features for DNN acoustic modeling. Experimental results on the UASpeech corpus show that the VAEVE based variability encodings are complementary to learning hidden unit contributions (LHUC) speaker adaptation. Systems using variability encodings consistently outperform comparable baseline systems without them, and obtain an absolute word error rate (WER) reduction of up to 2.2 on the "Mixed" type of dysarthric speech with diverse or uncertain conditions.
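
The abstract does not give the network architecture, so the following is only a minimal sketch of the general idea it describes: a latent variable sampled with the reparameterization step used by stochastic gradient variational Bayes, a decoder input that combines phoneme information with that latent variable, and a per-frame objective with a reconstruction term and a KL term. The function names, the one-hot phoneme representation, the squared-error reconstruction term, and the standard normal prior are all illustrative assumptions, not details taken from the paper.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization step)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)); prior is assumed."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def decoder_input(z, phoneme_id, num_phonemes):
    """Concatenate a one-hot phoneme label (assumed encoding) with the latent z."""
    one_hot = [1.0 if i == phoneme_id else 0.0 for i in range(num_phonemes)]
    return one_hot + list(z)


def negative_elbo(x, x_hat, mu, log_var):
    """Squared-error reconstruction term plus KL regularizer for one frame."""
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var)


# Hypothetical toy values: a 2-dimensional latent and 3 phoneme classes.
rng = random.Random(0)
mu, log_var = [0.0, 0.0], [0.0, 0.0]
z = reparameterize(mu, log_var, rng)
dec_in = decoder_input(z, phoneme_id=1, num_phonemes=3)
```

Because the decoder in this sketch already receives the phoneme label, the latent variable only needs to carry whatever the label cannot explain, which is the intuition behind using it as a phoneme-independent variability encoding.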


research
03/19/2022

Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition

Articulatory features are inherently invariant to acoustic signal distor...
research
04/01/2022

Filter-based Discriminative Autoencoders for Children Speech Recognition

Children speech recognition is indispensable but challenging due to the ...
research
01/14/2022

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition

Automatic recognition of disordered speech remains a highly challenging ...
research
11/03/2022

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition

Automatic recognition of disordered speech remains a highly challenging ...
research
01/12/2016

Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation

This work presents a broad study on the adaptation of neural network aco...
research
09/20/2017

Updating the silent speech challenge benchmark with deep learning

The 2010 Silent Speech Challenge benchmark is updated with new results o...
research
09/05/2019

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition

In this paper, we tackle the problem of handling narrowband and wideband...
