Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation

07/16/2022
by   Viet Anh Trinh, et al.

We present an approach for automatic speech recognition (ASR) that reduces the performance disparity between geographic regions without degrading performance on the overall user population. A popular approach is to fine-tune the model with data from regions where the ASR model has a higher word error rate (WER). However, when the ASR model is adapted to perform better on these high-WER regions, its parameters drift away from their previous optimal values, which can lead to worse performance in other regions. In our proposed method, we use the elastic weight consolidation (EWC) regularization loss to identify directions in parameter space along which the ASR weights can vary to improve performance in high-error regions while still maintaining performance on the speaker population overall. Our results demonstrate that EWC can reduce the WER in the region with the highest WER by 3.2% relative. We also evaluate the role of language and acoustic models in ASR fairness and propose a clustering algorithm to identify WER disparities based on geographic region.
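As a rough illustration of the regularizer named in the abstract, the sketch below uses the commonly cited EWC form with a diagonal Fisher information estimate: the fine-tuning loss plus (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2. The abstract does not spell out the exact formulation used in the paper, so this form, the function names, and all numbers are assumptions chosen for illustration only. A small helper also shows how a relative WER reduction is computed.

```python
from typing import Sequence


def ewc_penalty(theta: Sequence[float],
                theta_star: Sequence[float],
                fisher: Sequence[float],
                lam: float) -> float:
    """Quadratic EWC penalty (assumed diagonal-Fisher form).

    theta      : current parameters during fine-tuning
    theta_star : parameters of the original model (before adaptation)
    fisher     : per-parameter importance estimates for the original data
    lam        : regularization strength (hypothetical hyperparameter)
    """
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher)
    )


def ewc_total_loss(task_loss: float,
                   theta: Sequence[float],
                   theta_star: Sequence[float],
                   fisher: Sequence[float],
                   lam: float) -> float:
    """Fine-tuning loss on the high-WER region plus the EWC penalty."""
    return task_loss + ewc_penalty(theta, theta_star, fisher, lam)


def relative_wer_reduction(wer_before: float, wer_after: float) -> float:
    """Relative WER reduction in percent."""
    return 100.0 * (wer_before - wer_after) / wer_before


# Illustrative values only, not taken from the paper.
theta_star = [0.0, 2.0]   # original weights
theta = [1.0, 2.0]        # weights after some fine-tuning steps
fisher = [4.0, 1.0]       # first weight matters more to the original data
penalty = ewc_penalty(theta, theta_star, fisher, lam=0.5)
```

Parameters with a large Fisher value are held close to their original values, while parameters with a small value remain free to move toward the high-WER region's optimum. That is the sense in which the penalty picks out directions in parameter space that are safe to change.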
