Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model

09/11/2019
by Anjuli Kannan, et al.

Multilingual end-to-end (E2E) models have shown great promise in expanding automatic speech recognition (ASR) coverage of the world's languages. They improve on monolingual systems and simplify training and serving by eliminating language-specific acoustic, pronunciation, and language models. This work presents an E2E multilingual system that is equipped to operate in low-latency interactive applications and to handle a key challenge of real-world data: the imbalance in training data across languages. Using nine Indic languages, we compare a variety of techniques and find that combining conditioning on a language vector with training language-specific adapter layers produces the best model. The resulting E2E multilingual model achieves a lower word error rate (WER) than both monolingual E2E models (eight of nine languages) and monolingual conventional systems (all nine languages).
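
The two techniques named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the language vector is built or where the adapter layers sit, so the one-hot encoding, the per-frame concatenation, the bottleneck-style residual adapter, and every name and dimension below are assumptions made only for illustration.

```python
import numpy as np

NUM_LANGUAGES = 9  # the abstract's nine Indic languages


def language_one_hot(lang_id, num_languages=NUM_LANGUAGES):
    """Assumed encoding: a one-hot vector identifying the utterance language."""
    vec = np.zeros(num_languages, dtype=np.float32)
    vec[lang_id] = 1.0
    return vec


def condition_on_language(features, lang_id, num_languages=NUM_LANGUAGES):
    """Append the language vector to every frame: (T, D) -> (T, D + num_languages)."""
    lang = language_one_hot(lang_id, num_languages)
    tiled = np.tile(lang, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)


class ResidualAdapter:
    """Hypothetical language-specific adapter: down-project, ReLU, up-project, add back."""

    def __init__(self, dim, bottleneck, rng):
        self.down = rng.normal(0.0, 0.02, size=(dim, bottleneck)).astype(np.float32)
        # Zero-initialised up-projection, so an untrained adapter is the identity.
        self.up = np.zeros((bottleneck, dim), dtype=np.float32)

    def __call__(self, hidden):
        projected = np.maximum(hidden @ self.down, 0.0)  # ReLU
        return hidden + projected @ self.up  # residual connection


rng = np.random.default_rng(0)

# One adapter per language; the shared layers around them are not modelled here.
adapters = {
    lang_id: ResidualAdapter(dim=16, bottleneck=4, rng=rng)
    for lang_id in range(NUM_LANGUAGES)
}

frames = rng.normal(size=(5, 8)).astype(np.float32)    # 5 frames, 8 features each
conditioned = condition_on_language(frames, lang_id=2)  # shape (5, 17)

hidden = rng.normal(size=(5, 16)).astype(np.float32)    # stand-in for a shared layer's output
adapted = adapters[2](hidden)                           # route through language 2's adapter
```

In this sketch the language identity selects which adapter a hidden activation passes through, while the shared weights would be common to all languages. The zero-initialised up-projection is a design choice of the sketch, not something the abstract states.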

research
03/30/2022

Code Switched and Code Mixed Speech Recognition for Indic languages

Training multilingual automatic speech recognition (ASR) systems is chal...
research
02/22/2023

UML: A Universal Monolingual Output Layer for Multilingual ASR

Word-piece models (WPMs) are commonly used subword units in state-of-the...
research
06/17/2019

Adversarial Training for Multilingual Acoustic Modeling

Multilingual training has been shown to improve acoustic modeling perfor...
research
07/26/2021

Multilingual Coreference Resolution with Harmonized Annotations

In this paper, we present coreference resolution experiments with a newl...
research
05/07/2021

Efficient Weight factorization for Multilingual Speech Recognition

End-to-end multilingual speech recognition involves using a single model...
research
06/27/2023

Confidence-based Ensembles of End-to-End Speech Recognition Models

The number of end-to-end speech recognition models grows every year. The...
research
10/15/2021

Multilingual Speech Recognition using Knowledge Transfer across Learning Processes

Multilingual end-to-end (E2E) models have shown a great potential in the ...