Semi-Autoregressive Neural Machine Translation

08/26/2018
by Chunqi Wang, et al.

Existing approaches to neural machine translation are typically autoregressive models. While these models attain state-of-the-art translation quality, they suffer from low parallelizability and are therefore slow when decoding long sequences. In this paper, we propose a novel model for fast sequence generation, the semi-autoregressive Transformer (SAT). The SAT keeps the autoregressive property globally but relaxes it locally, and is thus able to produce multiple successive words in parallel at each time step. Experiments conducted on English-German and Chinese-English translation tasks show that the SAT achieves a good balance between translation quality and decoding speed. On WMT'14 English-German translation, the SAT achieves a 5.58× speedup while maintaining 88% of the translation quality, significantly better than previous non-autoregressive methods. When producing two words at each time step, the SAT is almost lossless (only a 1% degradation in BLEU score).
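
The core idea, emitting a group of K successive words in parallel at each decoding step while conditioning every group on all previously generated words, can be sketched as a simple decoding loop. The sketch below is illustrative only: the interface (model.encode, model.decode_group) is hypothetical and stands in for whatever encoder and grouped decoder a SAT-style model exposes; it is not the paper's implementation.

# Minimal sketch of semi-autoregressive decoding, assuming a hypothetical
# `model` with `encode` (source encoding) and `decode_group` (predicts the
# next K target tokens in one parallel decoder pass). Names are illustrative.
def sat_decode(model, src_tokens, K=2, max_len=128, bos=1, eos=2):
    """Generate a translation K tokens at a time.

    Globally autoregressive: each group of K tokens is conditioned on the
    full prefix of previously generated tokens. Locally non-autoregressive:
    the K tokens within a group are predicted in parallel.
    """
    memory = model.encode(src_tokens)      # encode the source sentence once
    target = [bos]                         # generated target prefix
    while len(target) < max_len:
        # one decoder pass emits the next K tokens in parallel
        group = model.decode_group(memory, target, group_size=K)
        for tok in group:
            if tok == eos:
                return target[1:]          # drop the BOS marker
            target.append(tok)
    return target[1:]

With group size K, decoding a length-T translation takes roughly T/K steps; K=1 recovers the standard autoregressive Transformer, while K=2 corresponds to the near-lossless setting reported above.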

Related research

11/12/2018  End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
            Autoregressive decoding is the only part of sequence-to-sequence models ...

09/23/2021  The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21
            This paper describes the Volctrans' submission to the WMT21 news transla...

06/06/2019  Syntactically Supervised Transformers for Faster Neural Machine Translation
            Standard decoders for neural machine translation autoregressively genera...

07/17/2020  Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation
            Non-autoregressive translation (NAT) achieves faster inference speed but...

11/07/2017  Non-Autoregressive Neural Machine Translation
            Existing approaches to neural machine translation condition each output ...

11/13/2020  EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints
            We introduce an Edit-Based Transformer with Repositioning (EDITOR), whic...

10/13/2020  Incorporating BERT into Parallel Sequence Decoding with Adapters
            While large scale pre-trained language models such as BERT have achieved...