Phrase-Based & Neural Unsupervised Machine Translation

04/20/2018
by   Guillaume Lample, et al.
0

Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of bitexts, which hinders their applicability to the majority of language pairs. This work investigates how to learn to translate when having access to only large monolingual corpora in each language. We propose two model variants, a neural and a phrase-based model. Both versions leverage automatic generation of parallel data by backtranslating with a backward model operating in the other direction, and the denoising effect of a language model trained on the target side. These models are significantly better than methods from the literature, while being simpler and having fewer hyper-parameters. On the widely used WMT14 English-French and WMT16 German-English benchmarks, our models respectively obtain 27.1 and 23.6 BLEU points without using a single parallel sentence, outperforming the state of the art by more than 11 BLEU points.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/11/2015

On Using Monolingual Corpora in Neural Machine Translation

Recent work on end-to-end neural network-based architectures for machine...
research
10/31/2017

Unsupervised Machine Translation Using Monolingual Corpora Only

Machine translation has recently achieved impressive performance thanks ...
research
07/19/2021

Integrating Unsupervised Data Generation into Self-Supervised Neural Machine Translation for Low-Resource Languages

For most language combinations, parallel data is either scarce or simply...
research
09/04/2018

Unsupervised Statistical Machine Translation

While modern machine translation has relied on large parallel corpora, a...
research
04/30/2020

Semi-Supervised Text Simplification with Back-Translation and Asymmetric Denoising Autoencoders

Text simplification (TS) rephrases long sentences into simplified varian...
research
10/30/2017

Unsupervised Neural Machine Translation

In spite of the recent success of neural machine translation (NMT) in st...
research
03/29/2016

Prepositional Attachment Disambiguation Using Bilingual Parsing and Alignments

In this paper, we attempt to solve the problem of Prepositional Phrase (...

Please sign up or login with your details

Forgot password? Click here to reset