Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources

06/14/2023
by Kunal Dhawan, et al.

Multilingual Automatic Speech Recognition (ASR) models can transcribe audio in multiple languages, eliminating the need for a separate model per language. They can also perform Language Identification (LID) and handle code-switched speech. However, training these models requires dedicated code-switched and multilingual speech corpora, which are scarce. In this paper, we evaluate different approaches to training bilingual and code-switched ASR models using purely monolingual data sources. We introduce the concept of aggregate tokenizers, which, unlike the currently prevalent technique of generating LIDs at the boundaries of monolingual samples, produce an LID for each emitted token. We compare bilingual and monolingual model performance, showcase the efficacy of aggregate tokenizers, present a technique for generating synthetic code-switched ASR data, and demonstrate the effectiveness of the proposed code-switched ASR models on the tasks of speech recognition and spoken language identification.
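
To make the idea of a per-token LID concrete, here is a minimal sketch of one way an aggregate tokenizer could work. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe the mechanism, so the ID-offset scheme, the class names `ToyTokenizer` and `AggregateTokenizer`, the whitespace tokenization, and the English/Spanish example pair are all hypothetical. The assumed idea is that each monolingual tokenizer is assigned a disjoint range of token IDs in a shared vocabulary, so the language of any emitted token can be read off its ID.

```python
from dataclasses import dataclass


@dataclass
class ToyTokenizer:
    """Hypothetical stand-in for a monolingual tokenizer (whitespace words only)."""
    vocab: list[str]

    def encode(self, text: str) -> list[int]:
        # Map each whitespace-separated word to its index in the vocabulary.
        return [self.vocab.index(word) for word in text.split()]


class AggregateTokenizer:
    """Assumed scheme: give each language a disjoint block of token IDs."""

    def __init__(self, tokenizers: dict[str, ToyTokenizer]):
        self.tokenizers = tokenizers
        self.offsets: dict[str, int] = {}
        self.id_to_lang: list[str] = []
        offset = 0
        for lang, tok in tokenizers.items():
            self.offsets[lang] = offset
            # Every ID in this block belongs to `lang`.
            self.id_to_lang.extend([lang] * len(tok.vocab))
            offset += len(tok.vocab)

    def encode(self, text: str, lang: str) -> list[int]:
        # Shift monolingual IDs into the language's block of the shared space.
        shift = self.offsets[lang]
        return [shift + i for i in self.tokenizers[lang].encode(text)]

    def token_langs(self, ids: list[int]) -> list[str]:
        # Per-token LID: the language is recovered from the ID alone.
        return [self.id_to_lang[i] for i in ids]

    def decode(self, ids: list[int]) -> str:
        words = []
        for i in ids:
            lang = self.id_to_lang[i]
            words.append(self.tokenizers[lang].vocab[i - self.offsets[lang]])
        return " ".join(words)


# Illustrative language pair and vocabularies (assumptions, not from the paper).
agg = AggregateTokenizer({
    "en": ToyTokenizer(["hello", "world"]),
    "es": ToyTokenizer(["hola", "mundo"]),
})

# A synthetic code-switched label built by joining two monolingual transcripts.
ids = agg.encode("hello world", "en") + agg.encode("hola mundo", "es")
langs = agg.token_langs(ids)
text = agg.decode(ids)
```

In this sketch, joining two monolingual transcripts yields a code-switched label sequence whose language tags come for free from the token IDs, with no explicit LID marker at the sample boundary.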

