Performance Disparities Between Accents in Automatic Speech Recognition

08/01/2022
by Alex DiChristofano, et al.

Automatic speech recognition (ASR) services are ubiquitous, transforming speech into text for systems like Amazon's Alexa, Google's Assistant, and Microsoft's Cortana. However, researchers have identified biases in ASR performance between particular English language accents by racial group and by nationality. In this paper, we expand this discussion both qualitatively, by relating it to historical precedent, and quantitatively, through a large-scale audit. The standardization of language and the use of language to maintain global and political power have played an important role in history; we describe this history to show its parallels with the ways ASR services act on English language speakers today. Then, using a large, global data set of speech from The Speech Accent Archive, which includes over 2,700 speakers of English born in 171 different countries, we perform an international audit of some of the most popular English ASR services. We show that performance disparities exist as a function of whether or not a speaker's first language is English and that, even when controlling for multiple linguistic covariates, these disparities have a statistically significant relationship to the political alignment of the speaker's birth country with respect to the United States' geopolitical power.

