English Accent Accuracy Analysis in a State-of-the-Art Automatic Speech Recognition System

05/09/2021
by Guillermo Cámbara, et al.

Research in speech technologies has benefited greatly from recently created public domain corpora that contain thousands of hours of recordings. These large amounts of data are very helpful for training new, complex models based on deep learning. However, a lack of dialectal diversity in a corpus is known to cause performance biases in speech systems, mainly against underrepresented dialects. In this work, we evaluate a state-of-the-art deep learning-based automatic speech recognition (ASR) model on unseen data from a corpus with a wide variety of labeled English accents from different countries around the world. The model was trained on 44.5K hours of English speech from an open-access corpus, Multilingual LibriSpeech, and shows remarkable results on popular benchmarks. We test the accuracy of this ASR model on samples drawn from another public corpus that is continuously growing, the Common Voice dataset. We then plot the accuracy, measured as Word Error Rate, for each of the English accents included, showing that there is indeed an accuracy bias across accents, favoring the accents most prevalent in the training corpus.
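
As an illustration of the per-accent evaluation described in the abstract, the following is a minimal sketch of how Word Error Rate (WER) could be grouped by accent label. It uses the standard definition, word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function names, the lowercase-and-split text normalization, the pooling of errors per accent, and the sample sentences are all assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_accent(samples):
    """Pool edit errors and reference words per accent, then divide.

    `samples` is an iterable of (accent, reference, hypothesis) tuples.
    Pooling over all utterances of an accent is an assumption here;
    averaging per-utterance WER would give different numbers.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for accent, reference, hypothesis in samples:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        errors[accent] += word_edit_distance(ref, hyp)
        words[accent] += len(ref)
    return {a: errors[a] / words[a] for a in words if words[a] > 0}


# Hypothetical transcripts, for illustration only.
samples = [
    ("us", "the cat sat", "the cat sat"),
    ("scotland", "the cat sat down", "the cat sad"),
]
result = wer_by_accent(samples)
```

With these made-up samples, the "us" entry has no errors, while the "scotland" entry has one substitution and one deletion over four reference words. A bar chart of such a dictionary, one bar per accent, would correspond to the kind of graphical comparison the abstract mentions.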

research
06/27/2022

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

This paper introduces a new corpus of Mandarin-English code-switching sp...
research
12/13/2019

Common Voice: A Massively-Multilingual Speech Corpus

The Common Voice corpus is a massively-multilingual collection of transc...
research
05/25/2020

FT Speech: Danish Parliament Speech Corpus

This paper introduces FT Speech, a new speech corpus created from the re...
research
10/14/2021

CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Automatic speech recognition (ASR) is a complex and challenging task. In...
research
02/10/2021

NUVA: A Naming Utterance Verifier for Aphasia Treatment

Anomia (word-finding difficulties) is the hallmark of aphasia, an acquir...
research
03/29/2022

Earnings-22: A Practical Benchmark for Accents in the Wild

Modern automatic speech recognition (ASR) systems have achieved superhum...
research
12/11/2019

Leveraging End-to-End Speech Recognition with Neural Architecture Search

Deep neural networks (DNNs) have been demonstrated to outperform many tr...
