Diacritization of Maghrebi Arabic Sub-Dialects

10/15/2018

Diacritization attempts to restore the short vowels in written Arabic text, which are typically omitted. This process is essential for applications such as Text-to-Speech (TTS). While diacritization of Modern Standard Arabic (MSA) still holds the lion's share of research, work on dialectal Arabic (DA) diacritization is very limited. In this paper, we present our contribution and results on the automatic diacritization of two sub-dialects of Maghrebi Arabic, namely Tunisian and Moroccan, using a character-level deep neural network architecture that stacks two bi-LSTM layers over a CRF output layer. The model achieves a word error rate of 2.7 respectively and is capable of implicitly identifying the sub-dialect of the input.
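To make the character-level framing and the word error rate metric concrete, the following is a minimal sketch, not the paper's implementation. It assumes that each base character receives one label holding the diacritic marks that follow it, and that word error rate is the percentage of words whose diacritized form differs from the reference. The function names `split_labels`, `apply_labels`, and `word_error_rate` are hypothetical, and the paper's exact metric definition may differ.

```python
# Arabic diacritic marks: Unicode combining characters U+064B..U+0652
# (tanween, short vowels, shadda, sukun).
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)}


def split_labels(text):
    """Split diacritized text into base characters and per-character labels.

    Each label is the (possibly empty) string of diacritics that follows
    the corresponding base character. This is the assumed tagging scheme.
    """
    chars, labels = [], []
    for ch in text:
        if ch in DIACRITICS and chars:
            labels[-1] += ch  # attach the mark to the preceding base character
        else:
            chars.append(ch)
            labels.append("")
    return chars, labels


def apply_labels(chars, labels):
    """Rebuild diacritized text from base characters and predicted labels."""
    return "".join(c + label for c, label in zip(chars, labels))


def word_error_rate(reference, hypothesis):
    """Percentage of words whose diacritized form differs from the reference.

    Assumes both strings contain the same number of whitespace-separated words.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if len(ref_words) != len(hyp_words):
        raise ValueError("reference and hypothesis must have the same word count")
    if not ref_words:
        return 0.0
    errors = sum(r != h for r, h in zip(ref_words, hyp_words))
    return 100.0 * errors / len(ref_words)
```

Under this framing, a sequence model such as the bi-LSTM and CRF stack described above would predict one label per input character, and `apply_labels` would turn those predictions back into diacritized text for scoring.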
