Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review

by Mikel K. Ngueajio, et al.

Speech is the fundamental means of communication between humans. The advent of AI and sophisticated speech technologies has led to the rapid proliferation of human-to-computer interactions, fueled primarily by Automatic Speech Recognition (ASR) systems. ASR systems normally take human speech in the form of audio and convert it into words, but for some users they cannot decode the speech, and the output text is filled with errors that are incomprehensible to a human reader. These systems do not work equally well for everyone and actually hinder the productivity of some users. In this paper, we present research that addresses ASR biases against gender, race, and the sick and disabled, and we explore studies that propose ASR debiasing techniques for mitigating this discrimination. We also discuss techniques for designing more accessible and inclusive ASR technology. For each approach surveyed, we summarize the investigation and methods applied, the ASR systems and corpora used, and the research findings, and we highlight strengths and/or weaknesses. Finally, we propose future opportunities for Natural Language Processing researchers to explore in creating the next generation of ASR technologies.




Quantifying Bias in Automatic Speech Recognition

Automatic speech recognition (ASR) systems promise to deliver objective ...

Gender Representation in French Broadcast Corpora and Its Impact on ASR Performance

This paper analyzes the gender representation in four major corpora of F...

Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR

Despite the fact that variation is a fundamental characteristic of natur...

Can Voice Assistants Be Microaggressors? Cross-Race Psychological Responses to Failures of Automatic Speech Recognition

Language technologies have a racial bias, committing greater errors for ...

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data

Building inclusive speech recognition systems is a crucial step towards ...

Minuteman: Machine and Human Joining Forces in Meeting Summarization

Many meetings require creating a meeting summary to keep everyone up to ...

A Persian ASR-based SER: Modification of Sharif Emotional Speech Database and Investigation of Persian Text Corpora

Speech Emotion Recognition (SER) is one of the essential perceptual meth...
