Automatic Parallel Corpus Creation for Hindi-English News Translation Task

01/24/2019
by Aditya Kumar Pathak, et al.

Parallel corpora are very important for multilingual NLP tasks and for deep learning applications such as statistical machine translation systems. The Hindi-English parallel corpora available to date for the news translation task are very limited in size relative to what such systems require. In this work we develop a prototype of an automatic parallel corpus generation system that creates a Hindi-English parallel corpus for the news translation task. To verify the quality of the generated corpus, we evaluate it with various performance metrics, and the results are quite interesting.

05/14/2018

Bianet: A Parallel News Corpus in Turkish, Kurdish and English

We present a new open-source parallel corpus consisting of news articles...
10/04/2020

Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus

Machine translation has been a major motivation of development in natura...
01/27/2020

PMIndia – A Collection of Parallel Corpora of Languages of India

Parallel text is required for building high-quality machine translation ...
04/25/2021

Potential Idiomatic Expression (PIE)-English: Corpus for Classes of Idioms

We present a fairly large, Potential Idiomatic Expression (PIE) dataset ...
10/05/2017

Phrase Pair Mappings for Hindi-English Statistical Machine Translation

In this paper, we present our work on the creation of lexical resources ...
08/05/2020

Designing the Business Conversation Corpus

While the progress of machine translation of written text has come far i...
07/14/2019

Simple Automatic Post-editing for Arabic-Japanese Machine Translation

A common bottleneck for developing machine translation (MT) systems for ...