Black-box Adaptation of ASR for Accented Speech

06/24/2020
by Kartik Khandelwal, et al.

We introduce the problem of adapting a black-box, cloud-based ASR system to speech from a target accent. While leading online ASR services obtain impressive performance on mainstream accents, they perform poorly on sub-populations: we observed that the word error rate (WER) achieved by Google's ASR API on Indian accents is almost twice the WER on US accents. Existing adaptation methods either require access to model parameters or overlay an error-correcting module on output transcripts. We highlight the need for correlating outputs with the original speech to fix accent errors. Accordingly, we propose a novel coupling of an open-source accent-tuned local model with the black-box service, where the output from the service guides frame-level inference in the local model. Our fine-grained merging algorithm is better at fixing accent errors than existing word-level combination strategies. Experiments on Indian and Australian accents, with three leading ASR models as the service, show that we achieve as much as 28% relative reduction in WER over both the local and service models.
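
The abstract reports results as WER and as a relative reduction in WER. For readers unfamiliar with these metrics, the following is a minimal sketch of how they are commonly computed: WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function names `word_error_rate` and `relative_wer_reduction`, the whitespace tokenization, and the example sentences and numbers are illustrative assumptions; they are not taken from the paper, and the authors' scoring setup may differ (for example in text normalization).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are obtained by plain whitespace splitting, with no
    case folding or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion of a reference word
                curr[j - 1] + 1,                   # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up sentences: one substituted word out of five reference words.
    print(word_error_rate("turn on the kitchen light", "turn on the kitchen night"))
    # Made-up WERs, purely to show the arithmetic of a relative reduction.
    print(relative_wer_reduction(0.20, 0.15))
```

Under this definition, a relative reduction is measured against the baseline's own WER, so the same absolute improvement counts for more when the baseline WER is low.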

