Towards Neural Machine Translation with Partially Aligned Corpora

11/03/2017
by Yining Wang, et al.

While neural machine translation (NMT) has become the new paradigm, its parameter optimization requires large-scale parallel data, which is scarce in many domains and language pairs. In this paper, we address a new translation scenario in which only monolingual corpora and phrase pairs exist. We propose a new method for translation with partially aligned sentence pairs, which are derived from the phrase pairs and monolingual corpora. To make full use of the partially aligned corpora, we adapt the conventional NMT training method in two respects. First, different generation strategies are designed for aligned and unaligned target words. Second, a different objective function is designed to model the partially aligned parts. Experiments demonstrate that our method achieves relatively good results in this translation scenario, and that tiny bitexts can boost translation quality to a large extent.
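The abstract does not spell out the objective function, so the following is only an illustrative sketch of one way a loss could be restricted to the aligned parts of a target sentence. It assumes a per-step output distribution and a 0/1 mask marking which target words are covered by a phrase pair. The function name `masked_nll`, the mask convention, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math

def masked_nll(step_probs, target_ids, aligned_mask):
    """Average negative log-likelihood over target positions flagged as aligned.

    step_probs:   one probability distribution (list of floats) per decoding step
    target_ids:   reference token id at each step
    aligned_mask: 1 if the target word is covered by a phrase pair, else 0
                  (hypothetical convention, not from the paper)
    """
    total, count = 0.0, 0
    for probs, tok, keep in zip(step_probs, target_ids, aligned_mask):
        if keep:
            total += -math.log(probs[tok])
            count += 1
    # With no aligned positions there is nothing to score.
    return total / count if count else 0.0

# Toy example: three decoding steps over a three-word vocabulary.
step_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
target_ids = [0, 1, 2]
aligned_mask = [1, 1, 0]  # the third target word is unaligned and is skipped
loss = masked_nll(step_probs, target_ids, aligned_mask)
```

In this sketch the unaligned position contributes nothing to the loss. The paper instead describes separate generation strategies for unaligned words, which this simplification does not attempt to reproduce.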


research
01/29/2021

Synthesizing Monolingual Data for Neural Machine Translation

In neural machine translation (NMT), monolingual data in the target lang...
research
06/15/2016

Semi-Supervised Learning for Neural Machine Translation

While end-to-end neural machine translation (NMT) has made remarkable pr...
research
01/26/2021

Neural machine translation, corpus and frugality

In machine translation field, in both academia and industry, there is a ...
research
10/04/2016

Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions

In this paper we provide the largest published comparison of translation...
research
03/04/2022

EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation

Complete Multi-lingual Neural Machine Translation (C-MNMT) achieves supe...
research
05/10/2018

First Experiments with Neural Translation of Informal to Formal Mathematics

We report on our first experiments to train deep neural networks that au...
research
10/19/2018

Impact of Corpora Quality on Neural Machine Translation

Large parallel corpora that are automatically obtained from the web, doc...
