Investigation of Data Augmentation Techniques for Disordered Speech Recognition

01/14/2022
by   Mengzhe Geng, et al.

Disordered speech recognition is a highly challenging task. The underlying neuro-motor conditions of people with speech disorders, often compounded with co-occurring physical disabilities, make it difficult to collect the large quantities of speech required for system development. This paper investigates a set of data augmentation techniques for disordered speech recognition, including vocal tract length perturbation (VTLP), tempo perturbation and speed perturbation. Both normal and disordered speech were exploited in the augmentation process. Variability among impaired speakers in both the original and augmented data was modeled using learning hidden unit contributions (LHUC) based speaker adaptive training. The final speaker adapted system, constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation, produced a word error rate (WER) reduction of up to 2.92% over the baseline system without data augmentation, and gave an overall WER of 26.37%.
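
As a rough illustration of the speed perturbation named in the abstract, the sketch below resamples a waveform by a constant factor using linear interpolation, which changes duration and pitch together (tempo perturbation, by contrast, would change duration only). This is not the authors' implementation: the function name `speed_perturb`, the linear interpolation, the toy sine input and the factors 0.9, 1.0 and 1.1 are all assumptions chosen for illustration.

```python
import math


def speed_perturb(samples, factor):
    """Resample `samples` so the signal plays `factor` times faster.

    Hypothetical helper for illustration: factor > 1 shortens the signal and
    raises its pitch, factor < 1 lengthens it and lowers its pitch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_in = len(samples)
    if n_in == 0:
        return []
    # Output length scales inversely with the speed factor.
    n_out = max(1, int(round(n_in / factor)))
    out = []
    for i in range(n_out):
        pos = i * factor  # fractional read position in the input
        lo = min(int(math.floor(pos)), n_in - 1)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring input samples.
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


# Toy input (assumed): one second of a 100 Hz tone sampled at 16 kHz.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(sample_rate)]

# Assumed illustrative factors; each copy would be added to the training set.
augmented = {f: speed_perturb(tone, f) for f in (0.9, 1.0, 1.1)}
```

In practice a production pipeline would more likely use a band-limited resampler from an audio toolkit than plain linear interpolation, which is used here only to keep the sketch dependency-free.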

research
08/02/2021

Adversarial Data Augmentation for Disordered Speech Recognition

Automatic recognition of disordered speech remains a highly challenging ...
research
06/23/2022

Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection

Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating p...
research
11/19/2021

A comparison of streaming models and data augmentation methods for robust speech recognition

In this paper, we present a comparative study on the robustness of two d...
research
11/03/2022

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition

Automatic recognition of disordered speech remains a highly challenging ...
research
07/05/2023

Using Data Augmentations and VTLN to Reduce Bias in Dutch End-to-End Speech Recognition Systems

Speech technology has improved greatly for norm speakers, i.e., adult na...
research
07/11/2023

Improved POS tagging for spontaneous, clinical speech using data augmentation

This paper addresses the problem of improving POS tagging of transcripts...
research
08/12/2020

Mask Detection and Breath Monitoring from Speech: on Data Augmentation, Feature Representation and Modeling

This paper introduces our approaches for the Mask and Breathing Sub-Chal...