Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

05/31/2021
by Shammur Absar Chowdhury, et al.

With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR) that handles language and dialectal variation in spoken content. Recent studies show the efficacy of such systems over monolingual ones. In this study, we design a large multilingual end-to-end ASR system using a self-attention-based conformer architecture. We train the system on Arabic (Ar), English (En) and French (Fr). We evaluate its performance on three kinds of test cases: (i) monolingual (Ar, En and Fr); (ii) multi-dialectal (Modern Standard Arabic (MSA), along with dialectal varieties such as Egyptian and Moroccan); and (iii) code-switching, both cross-lingual (Ar-En/Fr) and dialectal (MSA-Egyptian dialect), and compare it with current state-of-the-art systems. Furthermore, we investigate the influence of different embedding/character representations, including character vs word-piece units and shared vs distinct input symbols per language. Our findings demonstrate the strength of such a model, which outperforms state-of-the-art monolingual dialectal Arabic and code-switching Arabic ASR systems.
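
To make the "shared vs distinct input symbol per language" comparison concrete, here is a minimal sketch of the two ways a character inventory could be built. It assumes character-level units and a hypothetical language-prefix scheme (`en:a`, `fr:a`); the toy transcripts, the `build_vocab` helper, and the prefix format are illustrative assumptions, not the tokenization used in the paper.

```python
from typing import Dict, List, Tuple

# Toy (language, transcript) pairs; hypothetical, not taken from the paper's data.
SAMPLES: List[Tuple[str, str]] = [
    ("en", "data"),
    ("fr", "date"),
    ("ar", "كتاب"),
]


def build_vocab(samples: List[Tuple[str, str]], distinct_per_language: bool) -> Dict[str, int]:
    """Map each output symbol to an integer id.

    Shared: a character is one symbol regardless of language.
    Distinct: each character is prefixed with its language tag (an assumed
    scheme), so the same Latin letter becomes a different symbol in En and Fr.
    """
    symbols = set()
    for lang, text in samples:
        for ch in text:
            symbols.add(f"{lang}:{ch}" if distinct_per_language else ch)
    # Sort for a deterministic symbol-to-id assignment.
    return {sym: idx for idx, sym in enumerate(sorted(symbols))}


shared = build_vocab(SAMPLES, distinct_per_language=False)
distinct = build_vocab(SAMPLES, distinct_per_language=True)
print(len(shared), len(distinct))
```

In this toy setup the distinct inventory is larger because English and French no longer share Latin letters, while Arabic symbols are unaffected since no other language here uses that script. The same trade-off (smaller shared output layer vs language-specific symbols) would apply to word-piece units.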

