The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition

01/13/2022
by Luke Prananta, et al.

In this paper, we investigate several existing generative adversarial network (GAN)-based voice conversion methods, together with one new state-of-the-art method, for enhancing dysarthric speech to improve dysarthric speech recognition. We compare key components of the existing methods in a rigorous ablation study to find the most effective way to improve dysarthric speech recognition. We find that straightforward signal processing methods, such as stationary noise removal and vocoder-based time stretching, lead to dysarthric speech recognition results comparable to those obtained with state-of-the-art GAN-based voice conversion methods, as measured on a phoneme recognition task. Additionally, our proposed solution, which combines MaskCycleGAN-VC with time-stretched enhancement, improves the phoneme recognition results for certain dysarthric speakers compared to our time-stretched baseline.
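
To make the idea of vocoder-based time stretching concrete, here is a minimal sketch of its time-axis step only. It assumes a vocoder has already analysed the audio into per-frame feature vectors (for example a spectral envelope) and will resynthesise audio afterwards; both steps are outside the sketch. The function name `stretch_frames` and the `rate` parameter are hypothetical, and linear interpolation is an illustrative choice, not necessarily what the paper uses.

```python
from typing import List, Sequence


def stretch_frames(frames: Sequence[Sequence[float]], rate: float) -> List[List[float]]:
    """Resample per-frame vocoder features along the time axis.

    rate > 1.0 yields fewer frames (faster speech);
    rate < 1.0 yields more frames (slower speech).
    Hypothetical helper for illustration, not the paper's implementation.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    n_in = len(frames)
    if n_in == 0:
        return []
    n_out = max(1, round(n_in / rate))
    out: List[List[float]] = []
    for i in range(n_out):
        # Map output frame i back to a fractional position on the input axis.
        pos = i * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        # Linearly interpolate between the two neighbouring input frames.
        out.append([(1 - frac) * a + frac * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Usage: stretch a toy 3-frame, 1-dimensional feature track to twice its length.
slowed = stretch_frames([[0.0], [1.0], [2.0]], rate=0.5)
```

Because only the frame sequence is resampled, pitch and spectral shape within each frame are left unchanged, which is the main reason to stretch in the vocoder domain rather than resample the waveform directly. Features with discontinuities, such as an F0 track with unvoiced frames, would need more careful handling than plain linear interpolation.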

research
07/21/2021

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

We present an unsupervised non-parallel many-to-many voice conversion (V...
research
01/10/2020

Improving Dysarthric Speech Intelligibility Using Cycle-consistent Adversarial Training

Dysarthria is a motor speech impairment affecting millions of people. Dy...
research
03/17/2022

Robust and Complex Approach of Pathological Speech Signal Analysis

This paper presents a study of the approaches in the state-of-the-art in...
research
11/05/2017

Robust Speech Recognition Using Generative Adversarial Networks

This paper describes a general, scalable, end-to-end framework that uses...
research
06/12/2023

Parameter-efficient Dysarthric Speech Recognition Using Adapter Fusion and Householder Transformation

In dysarthric speech recognition, data scarcity and the vast diversity b...
research
09/09/2022

Conversion of Acoustic Signal (Speech) Into Text By Digital Filter using Natural Language Processing

One of the most crucial aspects of communication in daily life is speech...
research
10/20/2022

Anchored Speech Recognition with Neural Transducers

Neural transducers have gained popularity in production ASR systems, ach...
