AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition

05/16/2020
by Afroz Ahamad, et al.

Modern Automatic Speech Recognition (ASR) technology has evolved to identify speech spoken by native speakers of a language very well. However, identifying speech spoken by non-native speakers continues to be a major challenge. In this work, we first spell out the key requirements for creating a well-curated database of speech samples in non-native accents for training and testing robust ASR systems. We then introduce AccentDB, one such database that contains samples of 4 Indian-English accents collected by us, along with a compilation of samples from 4 native-English accents and a metropolitan Indian-English accent. We also present an analysis of the separability of the collected accent data. Further, we present several accent classification models and evaluate them thoroughly against human-labelled accent classes. We test the generalization of our classifier models in a variety of setups of seen and unseen data. Finally, we introduce the task of accent neutralization of non-native accents to native accents using autoencoder models with task-specific architectures. Thus, our work aims to aid ASR systems at every stage of development with a database for training, classification models for feature augmentation, and neutralization systems for acoustic transformations of non-native accents of English.

research
03/01/2023

Synthetic Cross-accent Data Augmentation for Automatic Speech Recognition

The awareness for biased ASR datasets or models has increased notably in...
research
10/01/2021

Speech Technology for Everyone: Automatic Speech Recognition for Non-Native English with Transfer Learning

To address the performance gap of English ASR models on L2 English speak...
research
10/19/2021

AequeVox: Automated Fairness Testing of Speech Recognition Systems

Automatic Speech Recognition (ASR) systems have become ubiquitous. They ...
research
12/14/2020

REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

Accents mismatching is a critical problem for end-to-end ASR. This paper...
research
03/06/2021

JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and People-related Information Detection

We conducted a labeling work on a spoken Japanese dataset (I-JAS) for th...
research
07/20/2023

A Deep Dive into the Disparity of Word Error Rates Across Thousands of NPTEL MOOC Videos

Automatic speech recognition (ASR) systems are designed to transcribe sp...
research
12/22/2022

Pushing the performances of ASR models on English and Spanish accents

Speech to text models tend to be trained and evaluated against a single ...
