MvSR-NAT: Multi-view Subset Regularization for Non-Autoregressive Machine Translation

08/19/2021
by   Pan Xie, et al.

Conditional masked language models (CMLM) have shown impressive progress in non-autoregressive machine translation (NAT). They learn the conditional translation model by predicting a randomly masked subset of tokens in the target sentence. Building on the CMLM framework, we introduce Multi-view Subset Regularization (MvSR), a novel regularization method that improves the performance of NAT models. Specifically, MvSR consists of two parts: (1) shared mask consistency: we forward the same target sequence under different masking strategies and encourage the predictions at shared masked positions to be consistent with each other; (2) model consistency: we maintain an exponential moving average of the model weights and enforce consistency between the predictions of the averaged model and the online model. Without changing the CMLM-based architecture, our approach achieves remarkable performance on three public benchmarks, with 0.36-1.14 BLEU gains over previous NAT models. Moreover, compared with the stronger Transformer baseline, we reduce the gap to 0.01-0.44 BLEU on small datasets (WMT16 RO↔EN and IWSLT DE→EN).
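The abstract describes two consistency terms on top of a CMLM. Below is a minimal PyTorch sketch of how they might be wired up; the `cmlm` forward signature, the masking helper, and the choice of a symmetric KL penalty are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the two MvSR consistency terms, assuming a CMLM
# with signature cmlm(src, masked_tgt) -> logits of shape (B, T, V).
import torch
import torch.nn.functional as F

def random_mask(tgt, mask_id, pad_id):
    """Mask a random subset of non-pad target tokens (CMLM-style)."""
    valid = tgt.ne(pad_id)
    # Sample a per-sentence masking ratio, then mask that fraction of tokens.
    ratio = torch.rand(tgt.size(0), 1, device=tgt.device)
    mask = (torch.rand_like(tgt, dtype=torch.float) < ratio) & valid
    return tgt.masked_fill(mask, mask_id), mask

def mvsr_losses(cmlm, ema_cmlm, src, tgt, mask_id, pad_id):
    # Two views of the same target under independent masking strategies.
    view_a, mask_a = random_mask(tgt, mask_id, pad_id)
    view_b, mask_b = random_mask(tgt, mask_id, pad_id)
    logits_a = cmlm(src, view_a)  # (B, T, V)
    logits_b = cmlm(src, view_b)

    # (1) Shared mask consistency: positions masked in *both* views should
    # receive consistent predictions (assumes at least one shared position).
    shared = mask_a & mask_b
    log_p_a = F.log_softmax(logits_a[shared], dim=-1)
    log_p_b = F.log_softmax(logits_b[shared], dim=-1)
    l_shared = 0.5 * (
        F.kl_div(log_p_a, log_p_b, log_target=True, reduction="batchmean")
        + F.kl_div(log_p_b, log_p_a, log_target=True, reduction="batchmean")
    )

    # (2) Model consistency: match the online model against the EMA
    # "average" model on the same masked view; no gradient flows through
    # the EMA copy.
    with torch.no_grad():
        ema_logits = ema_cmlm(src, view_a)
    l_model = F.kl_div(
        F.log_softmax(logits_a[mask_a], dim=-1),
        F.log_softmax(ema_logits[mask_a], dim=-1),
        log_target=True, reduction="batchmean",
    )
    return l_shared, l_model

@torch.no_grad()
def ema_update(ema_cmlm, cmlm, decay=0.999):
    """Exponential moving average of the online model's weights."""
    for p_ema, p in zip(ema_cmlm.parameters(), cmlm.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1 - decay)
```

In training, these two terms would be added to the usual CMLM cross-entropy loss on masked positions, with `ema_update` called after each optimizer step; the decay value here is a common default, not one taken from the paper.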


