DeepAI AI Chat
Log In Sign Up

An Evaluation of Persian-English Machine Translation Datasets with Transformers

by   Amir Sartipi, et al.
University of Isfahan

Nowadays, many researchers are focusing their attention on the subject of machine translation (MT). However, Persian machine translation has remained unexplored despite a vast amount of research being conducted in languages with high resources, such as English. Moreover, while a substantial amount of research has been undertaken in statistical machine translation for some datasets in Persian, there is currently no standard baseline for transformer-based text2text models on each corpus. This study collected and analysed the most popular and valuable parallel corpora, which were used for Persian-English translation. Furthermore, we fine-tuned and evaluated two state-of-the-art attention-based seq2seq models on each dataset separately (48 results). We hope this paper will assist researchers in comparing their Persian to English and vice versa machine translation results to a standard baseline.


page 8

page 9


MorisienMT: A Dataset for Mauritian Creole Machine Translation

In this paper, we describe MorisienMT, a dataset for benchmarking machin...

A Parallel Corpus of Translationese

We describe a set of bilingual English--French and English--German paral...

Attention Link: An Efficient Attention-Based Low Resource Machine Translation Architecture

Transformers have achieved great success in machine translation, but tra...

On the Strengths of Cross-Attention in Pretrained Transformers for Machine Translation

We study the power of cross-attention in the Transformer architecture wi...

New Approach to translation of Isolated Units in English-Korean Machine Translation

It is the most effective way for quick translation of tremendous amount ...