Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

07/12/2023
by Wenxuan Wang, et al.

Multilingual speech recognition for both monolingual and code-switching speech is a challenging task. Recently, many works based on the Mixture of Experts (MoE) have made good progress in multilingual and code-switching ASR, but their computational complexity grows rapidly as the number of supported languages increases. In this work, we propose a computation-efficient network named Language-Routing Mixture of Experts (LR-MoE) for multilingual and code-switching ASR. LR-MoE extracts language-specific representations through a Mixture of Language Experts (MLE), whose learning is guided by a frame-wise language routing mechanism. A weight-shared frame-level language identification (LID) network is jointly trained as the shared pre-router of each MoE layer. Experiments show that the proposed method significantly improves multilingual and code-switching speech recognition performance over the baseline with comparable computational efficiency.
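The abstract describes the core mechanism: a shared frame-level LID network acts as a router that sends each frame to a language-specific expert. Below is a minimal PyTorch-style sketch of one such language-routed layer. It is not the authors' implementation; all class and variable names are illustrative, and it assumes hard (argmax) frame-wise routing, with the LID logits also returned so an auxiliary LID loss can guide the router.

```python
# Minimal sketch (not the paper's code) of a frame-wise language-routed MoE layer.
# Assumptions: a shared frame-level LID head produces per-frame language logits,
# and each frame is hard-routed (argmax) to one language-specific feed-forward expert.
import torch
import torch.nn as nn


class LanguageRoutedMoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_languages: int):
        super().__init__()
        # One feed-forward "language expert" per supported language.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_languages)
        )
        # Frame-level LID head used as the router; in the paper this router is
        # described as weight-shared across all MoE layers.
        self.lid_router = nn.Linear(d_model, num_languages)

    def forward(self, x: torch.Tensor):
        # x: (batch, time, d_model) frame-level encoder features
        lid_logits = self.lid_router(x)          # (batch, time, num_languages)
        route = lid_logits.argmax(dim=-1)        # hard frame-wise routing decision
        out = torch.zeros_like(x)
        for lang_id, expert in enumerate(self.experts):
            mask = route == lang_id              # frames assigned to this expert
            if mask.any():
                out[mask] = expert(x[mask])
        # Return LID logits as well so an auxiliary LID loss can train the router.
        return out, lid_logits


if __name__ == "__main__":
    layer = LanguageRoutedMoELayer(d_model=256, d_ff=1024, num_languages=2)
    frames = torch.randn(4, 50, 256)             # 4 utterances, 50 frames each
    y, lid_logits = layer(frames)
    print(y.shape, lid_logits.shape)
```

Because each frame activates only one expert, the per-frame compute stays roughly constant as more languages (experts) are added, which matches the paper's stated goal of comparable computational efficiency.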


Related research

06/05/2022 · LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Despite the rapid progress in automatic speech recognition (ASR) researc...

06/02/2021 · Dual Script E2E framework for Multilingual and Code-Switching ASR
India is home to multiple languages, and training automatic speech recog...

11/23/2021 · SpeechMoE2: Mixture-of-Experts Model with Improved Routing
Mixture-of-experts based acoustic models with dynamic routing mechanisms...

03/01/2023 · Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
We propose gated language experts to improve multilingual transformer tr...

06/29/2022 · Language-specific Characteristic Assistance for Code-switching Speech Recognition
Dual-encoder structure successfully utilizes two language-specific encod...

09/18/2023 · Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter
Multilingual intelligent assistants, such as ChatGPT, have recently gain...

05/07/2021 · SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts
Recently, Mixture of Experts (MoE) based Transformer has shown promising...