Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic

12/18/2017
by Alexander Erdmann, et al.

We present the second ever evaluated Arabic dialect-to-dialect machine translation effort, and the first to leverage external resources beyond a small parallel corpus. The subject has not previously received serious attention due to the lack of naturally occurring parallel data; yet its importance is evidenced by dialectal Arabic's wide usage and the breadth of its inter-dialect variation, which is comparable to that among the Romance languages. Our results suggest that modeling morphology and syntax significantly improves dialect-to-dialect translation, though optimizing such data-sparse models requires consideration of the linguistic differences between dialects and of the nature of the available data and resources. On a single-reference blind test set where untranslated input scores 6.5 BLEU and a model trained only on parallel data reaches 14.6 BLEU, pivot techniques and morphosyntactic modeling significantly improve performance to 17.5 BLEU.


