UML: A Universal Monolingual Output Layer for Multilingual ASR

02/22/2023
by   Chao Zhang, et al.
0

Word-piece models (WPMs) are commonly used subword units in state-of-the-art end-to-end automatic speech recognition (ASR) systems. For multilingual ASR, due to the differences in written scripts across languages, multilingual WPMs bring the challenges of having overly large output layers and scaling to more languages. In this work, we propose a universal monolingual output layer (UML) to address such problems. Instead of one output node for only one WPM, UML re-associates each output node with multiple WPMs, one for each language, and results in a smaller monolingual output layer shared across languages. Consequently, the UML enables to switch in the interpretation of each output node depending on the language of the input speech. Experimental results on an 11-language voice search task demonstrated the feasibility of using UML for high-quality and high-efficiency multilingual streaming ASR.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/11/2019

Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model

Multilingual end-to-end (E2E) models have shown great promise in expansi...
research
08/03/2021

A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English

We study training a single end-to-end (E2E) automatic speech recognition...
research
07/08/2020

Streaming End-to-End Bilingual ASR Systems with Joint Language Identification

Multilingual ASR technology simplifies model training and deployment, bu...
research
01/19/2019

Towards Universal End-to-End Affect Recognition from Multilingual Speech by ConvNets

We propose an end-to-end affect recognition approach using a Convolution...
research
11/10/2022

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

End-to-end multilingual ASR has become more appealing because of several...
research
08/29/2022

A Language Agnostic Multilingual Streaming On-Device ASR System

On-device end-to-end (E2E) models have shown improvements over a convent...
research
05/31/2021

Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

With the advent of globalization, there is an increasing demand for mult...

Please sign up or login with your details

Forgot password? Click here to reset