Marathi To English Neural Machine Translation With Near Perfect Corpus And Transformers

02/26/2020
by Swapnil Ashok Jadhav, et al.

There have been very few attempts to benchmark the performance of state-of-the-art algorithms on the Neural Machine Translation task for Indian languages. Google, Bing, Facebook and Yandex are among the very few companies that have built translation systems for a few Indian languages. Among them, translation results from Google appear to be better, based on general inspection. Bing Translator does not even support Marathi, which has around 95 million speakers and ranks 15th in the world in terms of combined primary and secondary speakers. In this exercise, we trained and compared a variety of neural Marathi-to-English translators, built with the BERT tokenizer from Hugging Face and various Transformer-based architectures on Facebook's Fairseq platform, using a limited but almost correct parallel corpus. The resulting models achieve better BLEU scores than Google on the Tatoeba and Wikimedia open datasets.
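The comparison above rests on BLEU, which scores system output by its n-gram overlap with reference translations. As a rough illustration of how two systems could be ranked this way, the following is a minimal sketch of corpus-level BLEU in plain Python. It assumes whitespace tokenization, one reference per sentence and no smoothing; none of these choices is confirmed by the abstract, and the sentences and the names `corpus_bleu`, `system_a` and `system_b` are invented for the example rather than taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions (illustrative only): whitespace tokenization,
    exactly one reference per hypothesis, and no smoothing.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives BLEU 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty discourages output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Invented English references and two hypothetical system outputs.
references = ["the cat sat on the mat", "i like reading books at night"]
system_a = ["the cat sat on the mat", "i like reading books at night"]
system_b = ["the cat sat on a mat", "i like reading books at night"]

print("system_a BLEU:", corpus_bleu(system_a, references))
print("system_b BLEU:", corpus_bleu(system_b, references))
```

Scores from a sketch like this are only comparable when every system is evaluated on the same test sentences with the same tokenization, so published comparisons normally rely on a standard scoring tool instead of a hand-written function.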


