MoLE: Mixture of Language Experts for Multi-Lingual Automatic Speech Recognition

02/27/2023
by Yoohwan Kwon, et al.

Multi-lingual speech recognition aims to distinguish the linguistic expressions of different languages while handling their acoustic processing simultaneously. However, current multi-lingual speech recognition research follows a language-aware paradigm that mainly targets improved recognition performance rather than the discrimination of language characteristics. In this paper, we present a multi-lingual speech recognition network named Mixture-of-Language-Experts (MoLE), which digests speech in a variety of languages. Specifically, MoLE analyzes the linguistic expression of input speech in an arbitrary language and activates a language-specific expert with a lightweight language tokenizer. The tokenizer not only activates experts but also estimates the reliability of the activation. Based on this reliability, the activated language-specific expert and a language-agnostic expert are aggregated into a language-conditioned embedding for efficient speech recognition. Our proposed model is evaluated in a 5-language scenario, and the experimental results show that our structure is advantageous for multi-lingual recognition, especially for speech in low-resource languages.
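
To make the routing and aggregation step concrete, here is a minimal PyTorch-style sketch of how a MoLE-style layer could combine a language-specific expert with a language-agnostic expert, gated by the tokenizer's reliability score. The class name MoLELayer, the use of simple linear experts, and the choice of the tokenizer's max softmax probability as the reliability score are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of a MoLE-style layer (not the authors' code).
# Assumptions: a frame-level language tokenizer produces per-language logits,
# and its max softmax probability serves as the "reliability" gate between the
# activated language-specific expert and a language-agnostic expert.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoLELayer(nn.Module):
    def __init__(self, dim: int, num_languages: int):
        super().__init__()
        # Lightweight language tokenizer: predicts a language per frame.
        self.tokenizer = nn.Linear(dim, num_languages)
        # One expert per language plus one language-agnostic expert.
        self.lang_experts = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_languages)]
        )
        self.agnostic_expert = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) acoustic features
        lang_probs = F.softmax(self.tokenizer(x), dim=-1)        # (B, T, L)
        reliability, lang_id = lang_probs.max(dim=-1)            # (B, T)

        # For clarity every expert is run and the activated one is gathered;
        # a sparse implementation would route frames to the selected expert only.
        expert_out = torch.stack([e(x) for e in self.lang_experts], dim=-2)  # (B, T, L, D)
        idx = lang_id.unsqueeze(-1).unsqueeze(-1).expand(-1, -1, 1, x.size(-1))
        specific = expert_out.gather(-2, idx).squeeze(-2)        # (B, T, D)

        agnostic = self.agnostic_expert(x)                       # (B, T, D)
        # Reliability-weighted aggregation -> language-conditioned embedding.
        w = reliability.unsqueeze(-1)                            # (B, T, 1)
        return w * specific + (1.0 - w) * agnostic


# Example usage with made-up sizes: 2 utterances, 100 frames, 256-dim features.
layer = MoLELayer(dim=256, num_languages=5)
out = layer(torch.randn(2, 100, 256))                            # (2, 100, 256)
```

In this sketch, frames whose language prediction is uncertain receive a low reliability weight and therefore lean on the language-agnostic expert, which is one plausible reading of how such an aggregation helps low-resource languages.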


Related research

Sequence-based Multi-lingual Low Resource Speech Recognition (02/21/2018)
PronouncUR: An Urdu Pronunciation Lexicon Generator (01/01/2018)
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition (04/08/2022)
Speech Activity Detection Based on Multilingual Speech Recognition System (10/23/2020)
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition (12/10/2021)
SpeechMoE2: Mixture-of-Experts Model with Improved Routing (11/23/2021)
Topic Model Robustness to Automatic Speech Recognition Errors in Podcast Transcripts (09/25/2021)
