Improving Low Resource Code-switched ASR using Augmented Code-switched TTS

10/12/2020
by   Yash Sharma, et al.
0

Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide. End-to-end ASR systems are a natural modeling choice due to their ease of use and superior performance in monolingual settings. However, it is well known that end-to-end systems require large amounts of labeled speech. In this work, we investigate improving code-switched ASR in low resource settings via data augmentation using code-switched text-to-speech (TTS) synthesis. We propose two targeted techniques to effectively leverage TTS speech samples: 1) Mixup, an existing technique to create new training samples via linear interpolation of existing samples, applied to TTS and real speech samples, and 2) a new loss function, used in conjunction with TTS samples, to encourage code-switched predictions. We report significant improvements in ASR performance achieving absolute word error rate (WER) reductions of up to 5 switching using our proposed techniques on a Hindi-English code-switched ASR task.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/01/2021

Multilingual and code-switching ASR challenges for low resource Indian languages

Recently, there is increasing interest in multilingual automatic speech ...
research
11/04/2020

Data Augmentation for End-to-end Code-switching Speech Recognition

Training a code-switching end-to-end automatic speech recognition (ASR) ...
research
10/06/2021

Integrating Categorical Features in End-to-End ASR

All-neural, end-to-end ASR systems gained rapid interest from the speech...
research
10/19/2020

Reduce and Reconstruct: Improving Low-resource End-to-end ASR Via Reconstruction Using Reduced Vocabularies

End-to-end automatic speech recognition (ASR) systems are increasingly b...
research
03/12/2021

Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition

With the rapid development of speech assistants, adapting server-intende...
research
06/09/2020

Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition

Recognizing code-switched speech is challenging for Automatic Speech Rec...
research
06/03/2023

Adapting Pretrained ASR Models to Low-resource Clinical Speech using Epistemic Uncertainty-based Data Selection

While there has been significant progress in ASR, African-accented clini...

Please sign up or login with your details

Forgot password? Click here to reset