Dual Script E2E framework for Multilingual and Code-Switching ASR

06/02/2021
by   Mari Ganesh Kumar, et al.

India is home to many languages, and training automatic speech recognition (ASR) systems for them is challenging. Over time, each language has adopted words from other languages, such as English, leading to code-mixing. Most Indian languages also have their own unique scripts, which is a major obstacle to training multilingual and code-switching ASR systems. Inspired by results in text-to-speech synthesis, we use an in-house, rule-based, phoneme-level common label set (CLS) representation to train multilingual and code-switching ASR for Indian languages. We propose two end-to-end (E2E) ASR systems. In the first, the E2E model is trained on the CLS representation, and a novel data-driven back-end recovers the native language script. In the second, we modify the E2E model so that the CLS representation and the native language characters are used simultaneously for training. We report results on the multilingual and code-switching tasks of the Indic ASR Challenge 2021. Our best results improve on the baseline system for both the multilingual and code-switching tasks on the challenge development data.
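To make the first system's idea concrete, the following is a minimal sketch of mapping two scripts onto shared labels and then recovering one script with a back-end learned from data. Everything in it is an assumption for illustration: the `TOY_CLS` table, the function names, and the character-level lookup are hypothetical. The authors' CLS is phoneme-level and rule-based, and the abstract does not describe how their back-end works.

```python
from collections import Counter, defaultdict

# Hypothetical toy label table (NOT the authors' in-house CLS): a few
# Devanagari and Tamil consonants mapped to shared phoneme-like labels.
TOY_CLS = {
    "क": "k", "म": "m", "ल": "l", "न": "n",  # Devanagari
    "க": "k", "ம": "m", "ல": "l", "ந": "n",  # Tamil
}


def to_cls(text):
    """Map each known native character to its shared label; keep others as-is."""
    return [TOY_CLS.get(ch, ch) for ch in text]


def fit_backend(native_texts):
    """For one language, learn the most frequent native character per label."""
    counts = defaultdict(Counter)
    for text in native_texts:
        for ch, label in zip(text, to_cls(text)):
            counts[label][ch] += 1
    return {label: c.most_common(1)[0][0] for label, c in counts.items()}


def to_native(labels, backend):
    """Turn a label sequence back into native script using a fitted back-end."""
    return "".join(backend.get(label, label) for label in labels)


# The same label sequence is produced from either script.
hindi_labels = to_cls("कमल")
tamil_labels = to_cls("கமல")

# A back-end fitted on Tamil text renders the shared labels in Tamil script.
tamil_backend = fit_backend(["கமல", "நலம"])
recovered = to_native(hindi_labels, tamil_backend)
```

In this toy setup a single model could be trained on the shared labels across languages, with a per-language back-end chosen at decoding time to restore the script. A real back-end would also have to handle vowels, conjuncts, and labels that map to more than one native character.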
