Non-Autoregressive Machine Translation with Latent Alignments

04/16/2020
by   Chitwan Saharia, et al.
0

This paper investigates two latent alignment models for non-autoregressive machine translation, namely CTC and Imputer. CTC generates outputs in a single step, makes strong conditional independence assumptions about output variables, and marginalizes out latent alignments using dynamic programming. Imputer generates outputs in a constant number of steps, and approximately marginalizes out possible generation orders and latent alignments for training. These models are simpler than existing non-autoregressive methods, since they do not require output length prediction as a pre-process. In addition, our architecture is simpler than typical encoder-decoder architectures, since input-output cross attention is not used. On the competitive WMT'14 En→De task, our CTC model achieves 25.7 BLEU with a single generation step, while Imputer achieves 27.5 BLEU with 2 generation steps, and 28.0 BLEU with 4 generation steps. This compares favourably to the baseline autoregressive Transformer with 27.8 BLEU.

READ FULL TEXT

page 1

page 2

page 3

page 4

08/18/2020

Glancing Transformer for Non-Autoregressive Neural Machine Translation

Non-autoregressive neural machine translation achieves remarkable infere...
02/20/2020

Imputer: Sequence Modelling via Imputation and Dynamic Programming

This paper presents the Imputer, a neural sequence model that generates ...
11/07/2017

Non-Autoregressive Neural Machine Translation

Existing approaches to neural machine translation condition each output ...
05/21/2022

Non-Autoregressive Neural Machine Translation: A Call for Clarity

Non-autoregressive approaches aim to improve the inference speed of tran...
04/19/2021

Can Latent Alignments Improve Autoregressive Machine Translation?

Latent alignment objectives such as CTC and AXE significantly improve no...
01/22/2020

Normalization of Input-output Shared Embeddings in Text Generation Models

Neural Network based models have been state-of-the-art models for variou...
03/09/2018

Fast Decoding in Sequence Models using Discrete Latent Variables

Autoregressive sequence models based on deep neural networks, such as RN...

Please sign up or login with your details

Forgot password? Click here to reset