MIT-QCRI Arabic Dialect Identification System for the 2017 Multi-Genre Broadcast Challenge

08/28/2017
by Suwon Shon, et al.

To successfully annotate the Arabic speech content found in open-domain media broadcasts, it is essential to be able to process a diverse set of Arabic dialects. For the 2017 Multi-Genre Broadcast challenge (MGB-3) there were two possible tasks: Arabic speech recognition and Arabic Dialect Identification (ADI). In this paper, we describe our efforts to create an ADI system for the MGB-3 challenge, with the goal of distinguishing among four major Arabic dialects, as well as Modern Standard Arabic. Our research focused on dialect variability and domain mismatches between the training and test domains. To achieve a robust ADI system, we explored both Siamese neural network models, to learn similarities and dissimilarities among Arabic dialects, and i-vector post-processing, to adapt to domain mismatches. Both acoustic and linguistic features were used for the final MGB-3 submissions, with the best primary system achieving 75% accuracy on the official 10hr test set.

