Machine Translation with Unsupervised Length-Constraints

04/07/2020
by   Jan Niehues, et al.

Deep learning has brought significant improvements in machine translation. While the gains in translation quality are impressive, the encoder-decoder architecture enables many more possibilities. In this paper, we explore one of these: the generation of constrained translations. We focus on length constraints, which are essential if the translation must be displayed in a given format. We propose an end-to-end approach for this task. In contrast to a traditional pipeline that first translates and then performs sentence compression, our approach learns text compression in a completely unsupervised manner. To fulfill the length constraints, we investigate several methods of integrating them into the model. Using the presented technique, we are able to significantly improve translation quality under constraints. Furthermore, by combining the idea with zero-shot multilingual machine translation, we are also able to perform unsupervised monolingual sentence compression.
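To make the notion of a length constraint concrete, the sketch below shows one simple way such a constraint could be expressed and checked. It is an illustration only: the abstract does not say which integration methods were investigated, so the pseudo-token scheme, the character-based budget, and the names `length_budget`, `tag_source`, and `satisfies` are all assumptions rather than the authors' method.

```python
def length_budget(source: str, percent: int = 80) -> int:
    """Hypothetical budget: the target may use at most `percent`% of the
    source length, measured in characters (an assumed unit)."""
    return max(1, len(source) * percent // 100)


def tag_source(source: str, budget: int, bucket: int = 10) -> str:
    """Prepend an assumed pseudo-token that tells the model the desired
    output length, rounded down to a multiple of `bucket`."""
    bucketed = (budget // bucket) * bucket
    return f"<len_{bucketed}> {source}"


def satisfies(translation: str, budget: int) -> bool:
    """Check whether a produced translation respects the budget."""
    return len(translation) <= budget
```

In a setup like this, the tagged source would be fed to the translation model during training and inference, and `satisfies` would only be used to evaluate how often the constraint is met.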


