Significance of Data Augmentation for Improving Cleft Lip and Palate Speech Recognition

10/02/2021
by Protima Nomo Sudro, et al.

Automatic recognition of pathological speech, particularly from children with an articulatory impairment, is a challenging task. One obstacle is the lack of domain-specific data, which hinders the use of speech recognition in speech-based applications targeting pathological speakers. To address this challenge, we investigate several data augmentation techniques that simulate training data for improving children's speech recognition, taking cleft lip and palate (CLP) speech as the case under study. The augmentation techniques explored in this study include vocal tract length perturbation (VTLP), reverberation, speaking rate modification, pitch modification, and speech feature modification using cycle-consistent adversarial networks (CycleGAN). We find that data augmentation significantly improves CLP speech recognition performance, an improvement that is more evident with CycleGAN-based feature modification, VTLP, and reverberation. More specifically, our systems achieve a lower phone error rate than systems without data augmentation.
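To make one of the listed techniques concrete, the sketch below shows the piecewise-linear frequency warping commonly used for VTLP. It is not taken from this paper. The function names, the 16 kHz sample rate, the 4800 Hz boundary frequency, and the warp-factor range of 0.9 to 1.1 are all assumptions for illustration. The paper's own configuration is not given in the abstract.

```python
import random


def vtlp_warp(freq_hz, alpha, sample_rate=16000, f_hi=4800.0):
    """Warp one frequency (Hz) with a piecewise-linear VTLP mapping.

    Frequencies below a boundary are scaled by alpha; frequencies above it
    are mapped linearly so that the Nyquist frequency stays fixed.
    All default values are illustrative assumptions.
    """
    nyquist = sample_rate / 2.0
    if not 0.0 < f_hi < nyquist:
        raise ValueError("f_hi must lie strictly between 0 and Nyquist")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")

    boundary = f_hi * min(alpha, 1.0) / alpha
    if freq_hz <= boundary:
        return freq_hz * alpha

    # Linear segment joining (boundary, alpha * boundary) to (nyquist, nyquist).
    slope = (nyquist - f_hi * min(alpha, 1.0)) / (nyquist - boundary)
    return nyquist - slope * (nyquist - freq_hz)


def sample_warp_factor(rng, low=0.9, high=1.1):
    """Draw a random warp factor per utterance (range is an assumption)."""
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = sample_warp_factor(rng)
    # Hypothetical filterbank centre frequencies, for demonstration only.
    centres = [250.0, 1000.0, 4000.0, 7000.0]
    for f in centres:
        print(f"{f:7.1f} Hz -> {vtlp_warp(f, alpha):7.1f} Hz (alpha={alpha:.3f})")
```

In a typical pipeline the warped frequencies would reposition the filterbank centres before feature extraction, so that each training utterance appears to come from a speaker with a slightly different vocal tract length.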

