Glancing Transformer for Non-Autoregressive Neural Machine Translation

08/18/2020
by   Lihua Qian, et al.

Non-autoregressive neural machine translation achieves remarkable inference acceleration compared to autoregressive models. However, current non-autoregressive models still fall behind their autoregressive counterparts in prediction accuracy. We attribute this accuracy gap to two disadvantages of non-autoregressive models: (a) they learn simultaneous generation under an overly strong conditional independence assumption; (b) they lack explicit target-side language modeling. In this paper, we propose the Glancing Transformer (GLAT) to address both disadvantages: it reduces the difficulty of learning simultaneous generation while introducing explicit target language modeling into the non-autoregressive setting. Experiments on several benchmarks demonstrate that our approach significantly improves the accuracy of non-autoregressive models without sacrificing any inference efficiency. In particular, GLAT achieves 30.91 BLEU on WMT 2014 German-English, narrowing the gap between autoregressive and non-autoregressive models to less than 0.5 BLEU points.
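As a rough illustration of the glancing idea sketched in the abstract, the snippet below shows one plausible way the training-time sampling could work: a first parallel decoding pass is compared against the reference, and the worse that pass is (measured by Hamming distance), the more ground-truth tokens are revealed to a second pass, with the loss taken only on the still-masked positions. This is a hedged, toy-level sketch; the function name `glancing_mask`, the `ratio` parameter, and the `"<mask>"` placeholder are illustrative assumptions, not the paper's actual implementation.

```python
import random

def glancing_mask(reference, first_pass_pred, ratio=0.5, rng=random):
    """Choose which reference tokens to reveal for a second decoding pass.

    The number of revealed tokens is proportional to the Hamming distance
    between the first parallel prediction and the reference: the worse the
    first pass, the more ground-truth tokens are "glanced" at in training.
    (Toy sketch; not the authors' code.)
    """
    # Hamming distance between the parallel prediction and the reference.
    distance = sum(p != r for p, r in zip(first_pass_pred, reference))
    n_reveal = int(ratio * distance)
    revealed = set(rng.sample(range(len(reference)), n_reveal))
    # Second-pass decoder input: revealed positions carry the reference
    # token, the rest stay masked; the loss covers only masked slots.
    decoder_input = [r if i in revealed else "<mask>"
                     for i, r in enumerate(reference)]
    loss_positions = [i for i in range(len(reference)) if i not in revealed]
    return decoder_input, loss_positions
```

With `ratio=0.5` and a first pass that gets 2 of 6 tokens wrong, one reference token is revealed and the loss is computed on the other five positions, so the amount of ground-truth "glancing" adapts to how hard the sentence currently is for the model.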


