Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition

06/14/2021
by   Andrew Slottje, et al.

Modeling code-switched speech is an important problem in automatic speech recognition (ASR). Labeled code-switched data are rare, so monolingual data are often used to model code-switched speech. These monolingual data may be more closely matched to one of the languages in the code-switch pair. We show that such asymmetry can bias prediction toward the better-matched language and degrade overall model performance. To address this issue, we propose a semi-supervised approach for code-switched ASR. We consider the case of English-Mandarin code-switching, and the problem of using monolingual data to build bilingual "transcription models" for annotation of unlabeled code-switched data. We first build multiple transcription models so that their individual predictions are variously biased toward either English or Mandarin. We then combine these biased transcriptions using confidence-based selection. This strategy generates a superior transcript for semi-supervised training, and obtains a 19% improvement over a system that relies on a transcription model built with only the best-matched monolingual data.
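The confidence-based selection step can be illustrated with a minimal sketch: each biased transcription model emits a transcript and a per-utterance confidence score for every unlabeled utterance, and the combination simply keeps the highest-confidence hypothesis per utterance. The names (`Hypothesis`, `select_transcripts`) and the scoring setup are illustrative assumptions, not the paper's implementation.

```python
# Sketch of confidence-based selection over variously biased transcription
# hypotheses (assumed interface; not the paper's actual code).
from dataclasses import dataclass

@dataclass
class Hypothesis:
    model: str         # e.g. a model biased toward English or Mandarin
    transcript: str
    confidence: float  # assumed per-utterance score, higher is better

def select_transcripts(hyps_per_utt):
    """For each unlabeled utterance, keep the hypothesis with the highest
    confidence among the variously biased transcription models."""
    return [max(hyps, key=lambda h: h.confidence) for hyps in hyps_per_utt]

# Example: two code-switched utterances, each transcribed by an
# English-biased and a Mandarin-biased model.
utts = [
    [Hypothesis("en-biased", "check my 微信", 0.82),
     Hypothesis("zh-biased", "check 我 微信", 0.74)],
    [Hypothesis("en-biased", "发 email 给 me", 0.61),
     Hypothesis("zh-biased", "发 email 给我", 0.77)],
]
selected = [h.transcript for h in select_transcripts(utts)]
```

The selected transcripts then serve as labels for semi-supervised training; because each utterance can draw from whichever model happened to be better matched, the combined transcript is less biased toward either language than any single model's output.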


Related research

- 10/21/2022 · Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation: Code-switching describes the practice of using more than one language in...
- 06/14/2023 · Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources: Multilingual Automatic Speech Recognition (ASR) models are capable of tr...
- 06/01/2020 · Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition: Recently, there has been significant progress made in Automatic Speech R...
- 06/16/2018 · Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition: In this paper, we present our overall efforts to improve the performance...
- 11/29/2021 · Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization: Conversational bilingual speech encompasses three types of utterances: t...
- 05/30/2019 · Lattice-based lightly-supervised acoustic model training: In the broadcast domain there is an abundance of related text data and p...
- 10/28/2018 · Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training: We focus on the problem of language modeling for code-switched language,...
