Extended Parallel Corpus for Amharic-English Machine Translation

This paper describes the acquisition, preprocessing, segmentation, and alignment of an Amharic-English parallel corpus intended to support machine translation for Amharic, an under-resourced language. The corpus is larger than previously compiled corpora and is released for research purposes. We trained neural machine translation (NMT) and phrase-based statistical machine translation (PBSMT) models on the corpus. In automatic evaluation, the NMT models outperform the PBSMT models.
