Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition

01/27/2022
by Mohammad Soleymanpour, et al.

Dysarthria is a motor speech disorder often characterized by reduced speech intelligibility through slow, uncoordinated control of speech production muscles. Automatic speech recognition (ASR) systems may help dysarthric talkers communicate more effectively. Robust dysarthria-specific ASR requires sufficient training speech, which is not readily available. Recent advances in multi-speaker end-to-end text-to-speech (TTS) synthesis suggest the possibility of using synthesis for data augmentation. In this paper, we aim to improve multi-speaker end-to-end TTS systems to synthesize dysarthric speech for improved training of a dysarthria-specific DNN-HMM ASR. In the synthesized speech, we add dysarthria severity level and pause insertion mechanisms to other control parameters such as pitch, energy, and duration. Results show that a DNN-HMM model trained on additional synthetic dysarthric speech achieves a word error rate (WER) improvement of 12.2% compared to the baseline, and that the addition of the severity level and pause insertion controls decreases WER by 6.5%, showing the effectiveness of adding these parameters. Audio samples are available at
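
The results above are reported as word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and an ASR hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper, and the paper's own scoring pipeline may differ (for example in text normalization).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; words are split on whitespace
    with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because WER is normalized by the reference length, it can exceed 1.0 when the hypothesis contains many insertions.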

research
08/16/2023

Accurate synthesis of Dysarthric Speech for ASR data augmentation

Dysarthria is a motor speech disorder often characterized by reduced spe...
research
11/04/2022

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

Stuttering is a speech disorder where the natural flow of speech is inte...
research
11/24/2020

Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech

In recent years, Text-To-Speech (TTS) has been used as a data augmentati...
research
05/18/2023

Use of Speech Impairment Severity for Dysarthric Speech Recognition

A key challenge in dysarthric speech recognition is the speaker-level di...
research
11/04/2019

What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis

End-to-end speech recognition systems have achieved competitive results ...
research
07/13/2020

The Faults in our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems

Speech and speaker recognition systems are employed in a variety of appl...
research
12/18/2019

A Cycle-GAN Approach to Model Natural Perturbations in Speech for ASR Applications

Naturally introduced perturbations in audio signal, caused by emotional ...
