On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition

11/01/2018
by Zhiping Zeng, et al.

Code-switching (CS) refers to a linguistic phenomenon where a speaker uses different languages in an utterance or between alternating utterances. In this work, we study end-to-end (E2E) approaches to the Mandarin-English code-switching speech recognition (CSSR) task. We first examine the effectiveness of using data augmentation and byte-pair encoding (BPE) subword units. More importantly, we propose a multitask learning recipe, where a language identification task is explicitly learned in addition to the E2E speech recognition task. Furthermore, we introduce an efficient word vocabulary expansion method for language modeling to alleviate data sparsity issues under the code-switching scenario. Experimental results on the SEAME data, a Mandarin-English CS corpus, demonstrate the effectiveness of the proposed methods.
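
The multitask recipe mentioned above can be illustrated with a minimal sketch. The abstract does not say how language identification targets are derived or how the two objectives are combined, so everything here is an assumption made for illustration: the helper names `language_tag`, `lid_targets`, and `multitask_loss` are hypothetical, the tags come from a simple Unicode-script check, and the linear interpolation with weight `lid_weight` is one common way to combine two losses, not necessarily the one used in the paper.

```python
from typing import List


def language_tag(token: str) -> str:
    """Tag a token as Mandarin ("ZH"), English ("EN"), or "OTHER".

    Assumption: a token containing any CJK Unified Ideograph counts as
    Mandarin, and a token containing ASCII letters counts as English.
    """
    if any("\u4e00" <= ch <= "\u9fff" for ch in token):
        return "ZH"
    if any(ch.isascii() and ch.isalpha() for ch in token):
        return "EN"
    return "OTHER"


def lid_targets(tokens: List[str]) -> List[str]:
    """Derive one language label per transcript token.

    These labels could serve as targets for an auxiliary
    language identification task.
    """
    return [language_tag(tok) for tok in tokens]


def multitask_loss(asr_loss: float, lid_loss: float, lid_weight: float = 0.1) -> float:
    """Interpolate the recognition loss and the auxiliary LID loss.

    Assumption: a linear interpolation; lid_weight is a hypothetical
    hyperparameter, not a value reported in the paper.
    """
    if not 0.0 <= lid_weight <= 1.0:
        raise ValueError("lid_weight must lie in [0, 1]")
    return (1.0 - lid_weight) * asr_loss + lid_weight * lid_loss


if __name__ == "__main__":
    utterance = ["我", "想", "book", "一", "张", "ticket"]
    print(lid_targets(utterance))
    print(multitask_loss(asr_loss=2.0, lid_loss=1.0, lid_weight=0.5))
```

In a real system the two losses would come from a neural model's recognition and language identification outputs; plain floats are used here only to keep the sketch self-contained.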

research
12/19/2021

Integrating Knowledge in End-to-End Automatic Speech Recognition for Mandarin-English Code-Switching

Code-Switching (CS) is a common linguistic phenomenon in multilingual co...
research
02/19/2020

Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition

Recently, language identity information has been utilized to improve the...
research
10/20/2022

Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS

Current end-to-end code-switching Text-to-Speech (TTS) can already gener...
research
10/07/2021

Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models

Code-switching (CS) is common in daily conversations where more than one...
research
07/15/2019

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data

End-to-end (E2E) systems are fast replacing the conventional systems in ...
research
11/09/2017

Language Modeling for Code-Switched Data: Challenges and Approaches

Lately, the problem of code-switching has gained a lot of attention and ...
research
04/11/2022

End-to-End Speech Translation for Code Switched Speech

Code switching (CS) refers to the phenomenon of interchangeably using wo...
