Controlling the Output Length of Neural Machine Translation

10/23/2019
by Surafel Melaku Lakew, et al.

The recent advances introduced by neural machine translation (NMT) are rapidly expanding the application fields of machine translation, as well as reshaping the quality level to be targeted. In particular, if translations have to fit a given layout, quality should be measured not only in terms of adequacy and fluency, but also in terms of length. Typical cases are the translation of document files, subtitles, and scripts for dubbing, where the output length should ideally be as close as possible to the length of the input text. To the best of our knowledge, this paper is the first to address the problem of controlling the output length in NMT. We investigate two methods for biasing the output length with a transformer architecture: i) conditioning the output on a given target-source length-ratio class and ii) enriching the transformer positional embedding with length information. Our experiments show that both methods can induce the network to generate shorter translations and to acquire interpretable linguistic skills.
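The two methods named in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything specific here is an assumption made for illustration: the token names `<short>`, `<normal>`, `<long>`, the thresholds `SHORT_MAX` and `LONG_MIN`, the use of character counts to measure length, and the choice to encode the number of remaining tokens with sinusoids. It should not be read as the authors' implementation.

```python
import math

# Hypothetical thresholds on the target/source character-length ratio.
# The abstract does not state the actual class boundaries.
SHORT_MAX = 0.95
LONG_MIN = 1.05


def length_class(source: str, target: str) -> str:
    """Map a training pair to a coarse target/source length-ratio class."""
    ratio = len(target) / max(len(source), 1)
    if ratio < SHORT_MAX:
        return "<short>"
    if ratio > LONG_MIN:
        return "<long>"
    return "<normal>"


def tag_source(source: str, target: str) -> str:
    """Method (i), sketched: prepend the class token to the source sentence."""
    return f"{length_class(source, target)} {source}"


def remaining_length_encoding(position: int, target_len: int, dim: int) -> list:
    """Method (ii), sketched: sinusoidal encoding of the tokens left to emit.

    Assumes length information enters as (target_len - position); this is
    one plausible reading of the abstract, not a confirmed formulation.
    """
    remaining = target_len - position
    encoding = []
    for i in range(0, dim, 2):
        angle = remaining / (10000 ** (i / dim))
        encoding.append(math.sin(angle))
        if i + 1 < dim:
            encoding.append(math.cos(angle))
    return encoding
```

Under these assumptions, the class token is computed from reference pairs during training, while at inference time the desired token (for example `<short>`) would be chosen by the user and prepended to the input.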

research
12/16/2021

Isometric MT: Neural Machine Translation for Automatic Dubbing

Automatic dubbing (AD) is among the use cases where translations should ...
research
12/24/2020

Why Neural Machine Translation Prefers Empty Outputs

We investigate why neural machine translation (NMT) systems assign high ...
research
11/03/2018

Identifying and Controlling Important Neurons in Neural Machine Translation

Neural machine translation (NMT) models learn representations containing...
research
10/01/2019

When and Why is Document-level Context Useful in Neural Machine Translation?

Document-level context has received lots of attention for compensating n...
research
05/28/2021

Reinforcement Learning for on-line Sequence Transformation

A number of problems in the processing of sound and natural language, as...
research
10/18/2019

Controlling Utterance Length in NMT-based Word Segmentation with Attention

One of the basic tasks of computational language documentation (CLD) is ...
research
08/28/2018

A Tree-based Decoder for Neural Machine Translation

Recent advances in Neural Machine Translation (NMT) show that adding syn...
