A Deep Learning Approach for Similar Languages, Varieties and Dialects

01/02/2019
by Vidya Prasad K, et al.

Deep learning has recently become the prevailing approach for many tasks in natural language processing, speech recognition, image processing, and other areas. To leverage this, we use deep learning models, specifically Bidirectional Long Short-Term Memory (B-LSTM) for dialect identification in Arabic and German broadcast speech, and Long Short-Term Memory (LSTM) for discriminating between similar languages. Two distinct B-LSTM models are built, using Large-Vocabulary Continuous Speech Recognition (LVCSR) based lexical features and bottleneck features with a fixed length of 400 per utterance, generated by the i-vector framework. The models were evaluated on the VarDial 2017 datasets for Arabic and German dialect identification, covering Egyptian, Gulf, Levantine, North African, and Modern Standard Arabic (MSA) for Arabic and Basel, Bern, Lucerne, and Zurich for German, and for discriminating between similar languages such as Bosnian, Croatian, and Serbian. The B-LSTM model achieved an accuracy of 0.246 on the lexical features and 0.577 on the bottleneck features of the i-vector framework.


research
09/18/2018

Language Identification with Deep Bottleneck Features

In this paper we proposed an end-to-end short utterances speech language...
research
09/01/2023

ALJP: An Arabic Legal Judgment Prediction in Personal Status Cases Using Machine Learning Models

Legal Judgment Prediction (LJP) aims to predict judgment outcomes based ...
research
04/03/2021

Sexism detection: The first corpus in Algerian dialect with a code-switching in Arabic/ French and English

In this paper, an approach for hate speech detection against women in Ar...
research
07/09/2018

A Combined CNN and LSTM Model for Arabic Sentiment Analysis

Deep neural networks have shown good data modelling capabilities when de...
research
09/20/2018

LSTM-based Whisper Detection

This article presents a whisper speech detector in the far-field domain....
research
12/13/2020

SPARTA: Speaker Profiling for ARabic TAlk

This paper proposes a novel approach to an automatic estimation of three...
research
05/07/2019

Variational Representation Learning for Vehicle Re-Identification

Vehicle Re-identification is attracting more and more attention in recen...
