Using Data Augmentations and VTLN to Reduce Bias in Dutch End-to-End Speech Recognition Systems

07/05/2023
by Tanvina Patel, et al.

Speech technology has improved greatly for norm speakers, i.e., adult native speakers of a language without speech impediments or strong accents. However, non-norm or diverse speaker groups show a distinct performance gap relative to norm speakers, which we refer to as bias. In this work, we aim to reduce bias against different age groups and non-native speakers of Dutch. For an end-to-end (E2E) ASR system, we use state-of-the-art speed perturbation and spectral augmentation as data augmentation techniques and explore Vocal Tract Length Normalization (VTLN) to normalise for spectral differences due to differences in anatomy. The combination of data augmentation and VTLN reduced the average WER and bias across various diverse speaker groups by 6.9% and 3.9%, respectively. The VTLN model trained on Dutch was also found to be effective in improving performance on Mandarin Chinese child speech, thus showing generalisability across languages.
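To make the three techniques named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function names (`speed_perturb`, `spec_mask`, `vtln_warp`), the linear-interpolation resampler, the single-mask policy, the piecewise-linear warp shape, and the 0.85 cutoff ratio are all illustrative assumptions; the abstract does not specify any of these details.

```python
import numpy as np

def speed_perturb(wave, factor):
    """Resample a 1-D waveform by linear interpolation.

    factor > 1 speeds the audio up (shorter output); factor < 1 slows it down.
    Assumption: a simple interpolating resampler stands in for a proper one.
    """
    n_out = int(round(len(wave) / factor))
    # Positions in the original signal that each output sample reads from.
    src = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(src, np.arange(len(wave)), wave)

def spec_mask(spec, max_f, max_t, rng):
    """Zero one random frequency band and one random time band.

    spec has shape (n_freq, n_time); the input array is left unmodified.
    Assumption: one mask per axis, with widths drawn uniformly from 0..max.
    """
    out = spec.copy()
    n_f, n_t = out.shape
    f = rng.integers(0, max_f + 1)        # width of the frequency mask
    f0 = rng.integers(0, n_f - f + 1)     # start bin of the frequency mask
    out[f0:f0 + f, :] = 0.0
    t = rng.integers(0, max_t + 1)        # width of the time mask
    t0 = rng.integers(0, n_t - t + 1)     # start frame of the time mask
    out[:, t0:t0 + t] = 0.0
    return out

def vtln_warp(freqs, alpha, f_max, cut_ratio=0.85):
    """Piecewise-linear frequency warp with warping factor alpha.

    Below a cutoff, frequencies are scaled by alpha; above it, a second
    linear segment joins the warped cutoff to f_max so that 0 maps to 0
    and f_max maps to f_max. Assumption: cut_ratio is a hypothetical default.
    """
    freqs = np.asarray(freqs, dtype=float)
    f_cut = cut_ratio * f_max / max(alpha, 1.0)
    slope_hi = (f_max - alpha * f_cut) / (f_max - f_cut)
    low = alpha * freqs
    high = alpha * f_cut + slope_hi * (freqs - f_cut)
    return np.where(freqs <= f_cut, low, high)
```

In a pipeline of this kind, speed perturbation would typically be applied to the waveform before feature extraction and masking to the resulting spectrogram, while the warp factor `alpha` would be chosen per speaker to compensate for vocal tract length differences, such as those between children and adults.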


