DeepAI AI Chat
Log In Sign Up

Extended Parallel Corpus for Amharic-English Machine Translation

This paper describes the acquisition, preprocessing, segmentation, and alignment of an Amharic-English parallel corpus. It will be useful for machine translation of an under-resourced language, Amharic. The corpus is larger than previously compiled corpora; it is released for research purposes. We trained neural machine translation and phrase-based statistical machine translation models using the corpus. In the automatic evaluation, neural machine translation models outperform phrase-based statistical machine translation models.

READ FULL TEXT

page 1

page 3

page 5

page 7

page 9

page 10

06/06/2022

MorisienMT: A Dataset for Mauritian Creole Machine Translation

In this paper, we describe MorisienMT, a dataset for benchmarking machin...
06/12/2017

Six Challenges for Neural Machine Translation

We explore six challenges for neural machine translation: domain mismatc...
01/19/2023

Improving Machine Translation with Phrase Pair Injection and Corpus Filtering

In this paper, we show that the combination of Phrase Pair Injection and...
10/04/2016

Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions

In this paper we provide the largest published comparison of translation...
07/28/2020

Preparation of Sentiment tagged Parallel Corpus and Testing its effect on Machine Translation

In the current work, we explore the enrichment in the machine translatio...
04/05/2020

Reference Language based Unsupervised Neural Machine Translation

Exploiting common language as an auxiliary for better translation has a ...
10/19/2018

Impact of Corpora Quality on Neural Machine Translation

Large parallel corpora that are automatically obtained from the web, doc...