
Extended Parallel Corpus for Amharic-English Machine Translation

This paper describes the acquisition, preprocessing, segmentation, and alignment of an Amharic-English parallel corpus intended to support machine translation for Amharic, an under-resourced language. The corpus is larger than previously compiled corpora and is released for research purposes. We trained neural machine translation and phrase-based statistical machine translation models on the corpus. In automatic evaluation, the neural models outperform the phrase-based statistical models.


MorisienMT: A Dataset for Mauritian Creole Machine Translation

In this paper, we describe MorisienMT, a dataset for benchmarking machin...

Six Challenges for Neural Machine Translation

We explore six challenges for neural machine translation: domain mismatc...

Improving Machine Translation with Phrase Pair Injection and Corpus Filtering

In this paper, we show that the combination of Phrase Pair Injection and...

Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions

In this paper we provide the largest published comparison of translation...

Preparation of Sentiment tagged Parallel Corpus and Testing its effect on Machine Translation

In the current work, we explore the enrichment in the machine translatio...

Reference Language based Unsupervised Neural Machine Translation

Exploiting common language as an auxiliary for better translation has a ...

Impact of Corpora Quality on Neural Machine Translation

Large parallel corpora that are automatically obtained from the web, doc...