DARTS: Dialectal Arabic Transcription System

09/26/2019
by Sameer Khurana, et al.

We present DARTS, a speech-to-text transcription system for the low-resource Egyptian Arabic dialect. We analyze two approaches: transfer learning from the high-resource broadcast domain to the low-resource dialectal domain, and semi-supervised learning using in-domain unlabeled audio data collected from YouTube. Key features of our system are: a deep neural network acoustic model that consists of a front-end Convolutional Neural Network (CNN) followed by several layers of Time Delay Neural Network (TDNN) and Long Short-Term Memory (LSTM) recurrent neural network; sequence-discriminative training of the acoustic model; and n-gram and recurrent neural network language models for decoding and N-best list rescoring. We show that a simple transfer learning method can achieve good results. The results are further improved by using unlabeled data from YouTube in a semi-supervised setup. Various systems are combined to give the final system, which achieves the lowest word error rate on the community-standard Egyptian Arabic speech dataset (MGB-3).
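The N-best list rescoring mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the n-gram and recurrent language model scores are combined, so the log-linear interpolation below, the `Hypothesis` class, the `rescore` function, the `lm_weight` and `lm_scale` parameters, and all score values are assumptions made purely for illustration, not the system's actual method.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    acoustic_score: float  # log-likelihood from the acoustic model
    ngram_lm_score: float  # log-probability from the n-gram LM
    rnn_lm_score: float    # log-probability from the recurrent LM


def rescore(nbest, lm_weight=0.5, lm_scale=10.0):
    """Return the hypotheses sorted best-first by a combined score.

    Assumption: the two LM log-probabilities are linearly interpolated
    with weight `lm_weight` on the recurrent LM, then scaled by
    `lm_scale` and added to the acoustic log-likelihood.
    """
    def combined(h):
        lm = (1.0 - lm_weight) * h.ngram_lm_score + lm_weight * h.rnn_lm_score
        return h.acoustic_score + lm_scale * lm

    return sorted(nbest, key=combined, reverse=True)


# Hypothetical two-entry N-best list with made-up scores.
nbest = [
    Hypothesis("hypothesis A", -120.0, -14.0, -18.0),
    Hypothesis("hypothesis B", -121.0, -15.0, -12.0),
]

first_pass_best = rescore(nbest, lm_weight=0.0)[0]  # n-gram LM only
rescored_best = rescore(nbest, lm_weight=0.5)[0]    # with recurrent LM
print(first_pass_best.text, "->", rescored_best.text)
```

In this toy list the n-gram model alone prefers the first hypothesis, while adding the recurrent model's scores promotes the second. That reordering of first-pass output is what rescoring is meant to achieve.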


