On Using Monolingual Corpora in Neural Machine Translation

03/11/2015
by Caglar Gulcehre, et al.

Recent work on end-to-end neural network-based architectures for machine translation has shown promising results for En-Fr and En-De translation. Arguably, one of the major factors behind this success has been the availability of high-quality parallel corpora. In this work, we investigate how to leverage abundant monolingual corpora for neural machine translation. Compared to a phrase-based and hierarchical baseline, we obtain up to 1.96 BLEU improvement on the low-resource language pair Turkish-English, and 1.59 BLEU on the focused-domain task of Chinese-English chat messages. While our method was initially targeted toward such tasks with less parallel data, we show that it also extends to high-resource language pairs such as Cs-En and De-En, where we obtain improvements of 0.39 and 0.47 BLEU, respectively, over the neural machine translation baselines.
