Language-agnostic Code-Switching in End-To-End Speech Recognition

10/17/2022
by Enes Yavuz Ugan, et al.

Code-Switching (CS) refers to the phenomenon of alternately using words and phrases from different languages. While today's neural end-to-end (E2E) models deliver state-of-the-art performance on automatic speech recognition (ASR), these systems are known to be very data-intensive. However, only a small amount of transcribed and aligned CS speech is available. To overcome this problem and train multilingual systems that can transcribe CS speech, we propose a simple yet effective data augmentation in which audio and corresponding labels of different source languages are concatenated. Trained on this augmented data, our E2E model improves on transcribing CS speech and also improves performance over the multilingual model. The results show that this augmentation technique can even improve the model's performance on inter-sentential language switches not seen during training by 5.03% WER.
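The concatenation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Utterance` type, the helper names `concat_utterances` and `make_cs_example`, the number of segments, and the sampling strategy are all assumptions made for illustration. The sketch also assumes that all waveforms share one sample rate, and it does not insert silence or cross-fade between segments, since the abstract does not specify these details.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Utterance:
    """Hypothetical container for one monolingual training example."""
    samples: List[float]  # mono waveform samples (same sample rate assumed)
    text: str             # transcript used as the training label
    lang: str             # language tag, e.g. "de" or "en"


def concat_utterances(parts: Sequence[Utterance]) -> Utterance:
    """Join the audio end to end and join the transcripts with a space."""
    samples: List[float] = []
    for part in parts:
        samples.extend(part.samples)
    text = " ".join(part.text for part in parts)
    lang = "+".join(part.lang for part in parts)
    return Utterance(samples=samples, text=text, lang=lang)


def make_cs_example(
    pools: Dict[str, List[Utterance]],
    rng: random.Random,
    n_segments: int = 2,
) -> Utterance:
    """Build one artificial code-switching example.

    Draws n_segments utterances so that neighbouring segments always
    come from different languages, then concatenates them.
    """
    langs = sorted(pools)
    if len(langs) < 2:
        raise ValueError("need utterances from at least two languages")
    chosen: List[Utterance] = []
    prev: Optional[str] = None
    for _ in range(n_segments):
        # Exclude the previous language to force a switch at each boundary.
        lang = rng.choice([l for l in langs if l != prev])
        chosen.append(rng.choice(pools[lang]))
        prev = lang
    return concat_utterances(chosen)


if __name__ == "__main__":
    pools = {
        "de": [Utterance([0.1, 0.2], "guten morgen", "de")],
        "en": [Utterance([0.3], "good morning", "en")],
    }
    example = make_cs_example(pools, random.Random(0))
    print(example.lang, "|", example.text, "|", len(example.samples), "samples")
```

Because each segment here is a whole utterance, the artificial switches fall at utterance boundaries, which corresponds to inter-sentential rather than intra-sentential switching.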

research
12/19/2021

Integrating Knowledge in End-to-End Automatic Speech Recognition for Mandarin-English Code-Switching

Code-Switching (CS) is a common linguistic phenomenon in multilingual co...
research
10/04/2022

Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

We propose a) a Language Agnostic end-to-end Speech Translation model (L...
research
04/11/2022

End-to-End Speech Translation for Code Switched Speech

Code switching (CS) refers to the phenomenon of interchangeably using wo...
research
10/07/2021

Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models

Code-switching (CS) is common in daily conversations where more than one...
research
01/07/2022

Code-Switching Text Augmentation for Multilingual Speech Processing

The pervasiveness of intra-utterance Code-switching (CS) in spoken conte...
research
10/26/2022

Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization

Code-switching (CS) refers to the phenomenon that languages switch withi...
research
05/30/2022

Adversarial synthesis based data-augmentation for code-switched spoken language identification

Spoken Language Identification (LID) is an important sub-task of Automat...
