1 Introduction
Non-autoregressive machine translation models can significantly improve decoding speed by predicting every word in parallel (Gu et al., 2018; Libovický and Helcl, 2018). This advantage comes at a cost to performance, since modeling word order is trickier when the model cannot condition on its previous predictions. A range of semi-autoregressive models (Lee et al., 2018; Stern et al., 2019; Gu et al., 2019; Ghazvininejad et al., 2019) have shown that there is a speed-accuracy tradeoff that can be optimized with limited forms of autoregression. However, increasing the performance of purely non-autoregressive models without sacrificing decoding speed remains an open challenge. In this paper, we present a new training loss for non-autoregressive machine translation that softens the penalty for word order errors, and significantly improves performance with no modification to the model or to the decoding algorithm.
Figure 1: The target sequence (“it tastes pretty good though”) and the model’s top-5 predictions at each position. The predictions are shifted with respect to the target, so position-wise cross entropy penalizes nearly every token despite the small edit distance.
Existing models (both autoregressive and non-autoregressive) are typically trained with cross entropy loss. Cross entropy is a strict loss function, where a penalty is incurred for every word that is predicted out of position, even for output sequences with small edit distances (see Figure 1). Autoregressive models learn to avoid such penalties, since words are generated conditioned on the sentence prefix. However, non-autoregressive models do not know the exact sentence prefix, and should (intuitively) focus more on root errors (e.g. a missing word) while allowing more partial credit for cascading errors (the right word in the wrong place).
To achieve this more relaxed loss, we introduce aligned cross entropy (AXE), a new objective function that computes the cross entropy loss based on an alignment between the sequence of token labels and the sequence of token distribution predictions. AXE uses dynamic programming to find the monotonic alignment that minimizes the cross entropy loss. It provides non-autoregressive models with a more accurate training signal by ignoring absolute positions and focusing on relative order and lexical matching. We efficiently implement AXE via matrix operations, and use it to train conditional masked language models (CMLM; Ghazvininejad et al., 2019) for machine translation. AXE only slightly increases training time compared to cross entropy, and requires no changes to parallel argmax decoding.
Extensive experiments on machine translation benchmarks demonstrate that AXE substantially boosts the performance of CMLMs while retaining the same decoding speed. On WMT’14 EN-DE, training CMLMs with AXE (instead of the regular cross entropy loss) increases performance by 5 BLEU points; we observe similar trends on WMT’16 EN-RO and WMT’17 EN-ZH. Moreover, AXE CMLMs significantly outperform state-of-the-art non-autoregressive models, such as FlowSeq (Ma et al., 2019), as well as the recent CRF-based semi-autoregressive model with bigram LM decoding (Sun et al., 2019). Our detailed analysis suggests that training with AXE makes models more confident in their predictions, thus reducing multimodality and alleviating a key problem in non-autoregressive machine translation.
2 Aligned Cross Entropy
Let $Y = y_1, \ldots, y_n$ be a target sequence of $n$ tokens, and let $P = P_1, \ldots, P_m$ be the model predictions, a sequence of $m$ token probability distributions. Our goal is to find a monotonic alignment between $Y$ and $P$ that will minimize the cross entropy loss, and thus focus the penalty on lexical errors (predicting the wrong token) rather than positional errors (predicting the right token in the wrong place). We define an alignment $\alpha$ to be a function that maps target positions to prediction positions, i.e. $\alpha: \{1, \ldots, n\} \to \{1, \ldots, m\}$. We further assume that this alignment is monotonic, i.e. $\alpha(i) \leq \alpha(j)$ if $i \leq j$. Given a specific alignment $\alpha$, we define a conditional loss $L_\alpha$ as:
$$L_\alpha(Y, P) = -\sum_{i=1}^{n} \log P_{\alpha(i)}(y_i) \;-\; \sum_{k \notin \alpha(\{1,\ldots,n\})} \log P_k(\varepsilon) \qquad (1)$$
The first term of this loss function is an aligned cross entropy between $Y$ and $P$, and the second term is a penalty for unaligned predictions. Epsilon ($\varepsilon$) is a special “blank” token in our vocabulary that appears in the probability distributions, but that does not appear in the final output string.
Now, the final loss is the minimum over all possible monotonic alignments of the conditional loss:

$$\mathrm{AXE}(Y, P) = \min_{\alpha} L_\alpha(Y, P) \qquad (2)$$
Finding the optimal monotonic alignment between two sequences is a well-studied problem. For instance, dynamic time warping (DTW) (Sakoe and Chiba, 1978) is a well-known algorithm for finding the optimal alignment between two different time series. Here we extend the idea to compute the optimal alignment between a sequence of target tokens and a sequence of prediction probability distributions. We use a simple dynamic program to find the optimal alignment while calculating the AXE loss.
Table 1: The three local operations of AXE’s dynamic program, where $A_{i,j}$ is the minimum loss for aligning the target prefix $y_1, \ldots, y_i$ to the prediction prefix $P_1, \ldots, P_j$.

Align — aligns the current target $y_i$ with the current prediction $P_j$, updating along the diagonal: $A_{i,j} = A_{i-1,j-1} - \log P_j(y_i)$.

Skip Prediction — skips the current prediction by predicting an empty token ($\varepsilon$), updating along the $j$ axis: $A_{i,j} = A_{i,j-1} - \log P_j(\varepsilon)$. This operation is akin to inserting an empty token into the target sequence at the $i$-th position.

Skip Target — skips the current target $y_i$ by predicting it without incrementing the prediction iterator $j$, updating along the $i$ axis: $A_{i,j} = A_{i-1,j} - \delta \log P_j(y_i)$. This operation is akin to duplicating the prediction $P_j$. The hyperparameter $\delta$ controls how expensive this operation is; high values of $\delta$ will discourage alignments that skip too many target tokens.
Figure 2: The target sequence (“it tastes pretty good though”), the model’s top-5 predictions per position, and the alignment AXE finds (2, 3, 3, 4, 5): $y_1$ aligns with $P_2$, both $y_2$ and $y_3$ align with $P_3$, and so on.
Dynamic Programming
Given a sequence of $n$ target tokens $Y$ and a sequence of $m$ predictions $P$, we propose a method to find the score of the optimal alignment between any prefix of these two sequences, $y_1, \ldots, y_i$ and $P_1, \ldots, P_j$, for any $i \leq n$ and $j \leq m$. The score of the optimal alignment for the full sequences is obtained at $i = n$ and $j = m$.
We start by defining a matrix $A$ of $(n+1)$-by-$(m+1)$ dimensions, respectively corresponding to $Y$ and $P$, where $A_{i,j}$ represents the minimum loss value for aligning $y_1, \ldots, y_i$ to $P_1, \ldots, P_j$ as defined in Equation 2. We initialize $A_{0,0}$ to be $0$ and then proceed to fill the matrix by taking the local minimum at each cell from three possible operators: Align, Skip Prediction, and Skip Target. Table 1 describes each operation and its update formula. Once the matrix is full, the cell $A_{n,m}$ will contain the cross entropy loss of the optimal alignment. Algorithm 1 lays out a straightforward implementation of AXE’s dynamic program.
According to Equation 2, the optimal alignment can be many-to-one, where multiple target positions are mapped to a single prediction. This is computed by aligning the first mapped token and skipping the rest of the target tokens. To discourage skipping too many target tokens, we penalize skip target operators separately with the hyperparameter $\delta$, as described in Table 1. Setting $\delta = 1$ will result in the loss function defined in Equation 2, but as we show in our ablation study (Section 4.3), higher values yield better performance in practice.
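To make the recursion concrete, here is a minimal Python sketch of the dynamic program (our own illustrative code, not the paper’s Algorithm 1; the function name, nested-list log-probabilities, and integer token ids are assumptions):

```python
def axe_loss(target, log_probs, blank, delta=1.0):
    """Minimum aligned cross entropy between a target and predictions.

    target: list of n token ids.
    log_probs: m lists, log_probs[j][v] = log P_{j+1}(v).
    blank: id of the epsilon token.
    delta: skip-target penalty coefficient (Table 1).
    """
    n, m = len(target), len(log_probs)
    INF = float("inf")
    # A[i][j] = min loss aligning the first i targets to the first j predictions.
    A = [[INF] * (m + 1) for _ in range(n + 1)]
    A[0][0] = 0.0
    # First row: every prediction so far is skipped by predicting epsilon.
    for j in range(1, m + 1):
        A[0][j] = A[0][j - 1] - log_probs[j - 1][blank]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            align = A[i - 1][j - 1] - log_probs[j - 1][target[i - 1]]
            skip_pred = A[i][j - 1] - log_probs[j - 1][blank]
            skip_tgt = A[i - 1][j] - delta * log_probs[j - 1][target[i - 1]]
            A[i][j] = min(align, skip_pred, skip_tgt)
    return A[n][m]
```

With $\delta = 1$, skipping a target token costs the same as aligning it, so the recursion computes exactly the minimum of Equation 2 over monotonic alignments.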
Efficient Implementation
The implementation in Algorithm 1 has $O(n \cdot m)$ time complexity. However, multiple updates of the matrix $A$ can be parallelized on GPUs and other tensor-processing architectures. Rather than iterating over each cell, we iterate over each antidiagonal, computing all the values along the antidiagonal in parallel. In other words, we first compute the value of $A_{1,1}$, followed by $A_{1,2}$ and $A_{2,1}$, and so on. Since the number of antidiagonals is $O(n + m)$, we arrive at a time complexity of $O(n + m)$. Since $m$ is typically on the same order of magnitude as $n$, the linear cost of computing AXE during training becomes negligible compared to forward and backward passes through the model.^1 By doing so, we are able to achieve training times similar to (about 1.2 times slower than) training with cross entropy loss.

^1 Batch implementation of this algorithm is straightforward.

Example
Figure 2 depicts an example application of AXE. We see that the predictions are generally good, but start with a shift with respect to the target. This misalignment would cause the regular cross entropy loss to severely penalize the first three predictions, even though $P_2$ and $P_3$ are correct when aligned with $y_1$ and $y_2$. AXE, on the other hand, finds an alignment between the target and the predictions, which allows it to focus the penalty on the redundant prediction in $P_1$ and the missing token $y_3$ (“pretty”), i.e. the root errors.
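The antidiagonal scheme described in the efficient-implementation paragraph above can be sketched with NumPy (an illustrative wavefront under our own naming, not the paper’s batched implementation): each step fills one antidiagonal of $A$ with a single vectorized update.

```python
import numpy as np

def axe_loss_wavefront(target, lp, blank, delta=1.0):
    """Antidiagonal ("wavefront") evaluation of the AXE dynamic program.

    target: length-n list of token ids.
    lp: (m, V) array of log-probabilities, lp[j, v] = log P_{j+1}(v).
    The sequential depth is O(n + m) instead of O(n * m).
    """
    n, m = len(target), lp.shape[0]
    C = -lp[:, target].T                 # C[i-1, j-1] = -log P_j(y_i), shape (n, m)
    E = -lp[:, blank]                    # E[j-1] = -log P_j(eps)
    A = np.full((n + 1, m + 1), np.inf)
    A[0, 0] = 0.0
    for k in range(1, n + m + 1):        # k-th antidiagonal: cells with i + j == k
        i = np.arange(max(0, k - m), min(n, k) + 1)
        j = k - i
        # Out-of-range branches are masked with inf; wrapped indices are discarded.
        align = np.where((i > 0) & (j > 0), A[i - 1, j - 1] + C[i - 1, j - 1], np.inf)
        skip_pred = np.where(j > 0, A[i, j - 1] + E[j - 1], np.inf)
        skip_tgt = np.where((i > 0) & (j > 0), A[i - 1, j] + delta * C[i - 1, j - 1], np.inf)
        A[i, j] = np.minimum(align, np.minimum(skip_pred, skip_tgt))
    return A[n, m]
```

Each antidiagonal only depends on the two previous ones, so the loop body can be dispatched as one tensor operation per step.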
3 Training Non-Autoregressive Models
We use AXE to train conditional masked language models (CMLMs) for non-autoregressive machine translation (Ghazvininejad et al., 2019).^2

^2 While in this work we apply AXE to CMLMs, the loss function can be used to train other models as well. We leave further investigation of this direction to future work.
3.1 Conditional Masked Language Models
A conditional masked language model takes a source sequence $X$ and a partially-observed target sequence $Y_{\mathrm{obs}}$ as input, and predicts the probabilities of the masked (unobserved) target sequence tokens $Y_{\mathrm{mask}}$. The underlying architecture is an encoder-decoder transformer (Vaswani et al., 2017).
In the original paper, CMLMs are used for machine translation, where a random subset of tokens is masked at training time. However, at inference time all target tokens are masked ($Y_{\mathrm{obs}} = \emptyset$) and the length of $Y_{\mathrm{mask}}$ (the number of masked tokens) is unknown. To estimate the length of $Y_{\mathrm{mask}}$, an auxiliary task is introduced to predict the target length based on the source sequence $X$.^3

^3 See Ghazvininejad et al. (2019) for further detail.

3.2 Adapting CMLMs to AXE
In our case, the model can also produce blank tokens ($\varepsilon$), which effectively shorten the predicted sequence’s length. To account for potentially skipped tokens during inference, we multiply the predicted length by a length multiplier hyperparameter (which is tuned on the validation set) before applying argmax decoding.
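A minimal sketch of this decoding rule (hypothetical names; the actual CMLM implementation differs): inflate the predicted length, take the argmax at every position in parallel, and drop the blank tokens from the output.

```python
def decode(logits, predicted_length, blank, multiplier=1.1):
    """Argmax-decode a non-autoregressive output, dropping blank tokens.

    logits: list of per-position score lists (one list per decoder slot).
    predicted_length: target length estimated from the source sequence.
    multiplier: inflates the length to leave room for blanks (tuned on dev).
    """
    # Decode multiplier * predicted_length positions in parallel.
    m = min(len(logits), round(multiplier * predicted_length))
    tokens = [max(range(len(logits[j])), key=lambda v: logits[j][v]) for j in range(m)]
    # Epsilon never appears in the final string: filter it out.
    return [t for t in tokens if t != blank]
```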
3.3 Adapting the Training Objectives to AXE
Since this work focuses on the purely non-autoregressive setting, the entire target sequence will be masked at inference time ($Y_{\mathrm{obs}} = \emptyset$). The same does not have to hold for training; we can utilize partially-observed sequences in order to provide the learner with easier and more focused training examples. We experiment with three variations:
Unobserved Input, Predict All
All the tokens in the target sequence are masked, and the model is expected to predict all of them. This is a direct replication of the task at inference time. While AXE allows the number of masked tokens $m$ to be different from the length of the gold target sequence $n$, we found that setting $m = n$ produced better models in preliminary experiments.
Partially-Observed Input, Predict All
As in the original CMLM training process, a random subset of the target sequence is masked before being passed to the model as input.^4 We then apply AXE on the entire sequence, regardless of which tokens were observed. When training on partially-observed inputs, we always set $m = n$ to avoid further alterations of the gold target sequence beyond masking.

^4 The number of masked input tokens is distributed uniformly between 1 and the sequence length.
Partially-Observed Input, Predict Masks
The straightforward application of AXE to CMLM training (which ignores whether each token was masked or observed) works well in practice. However, we can also allow AXE to skip the observed tokens when computing cross entropy, and focus the training signal on the actual task. We do so by zeroing the align cost for every observed token $y_i$; i.e. if the $i$-th token is observed and is aligned with the prediction corresponding to the same position ($\alpha(i) = i$), there is no penalty. Our ablation studies show that this modification provides a modest but consistent boost in performance (see Section 4.3). As a result, we use this setting for training our model.
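One way to implement this variant is to precompute the matrix of align costs and zero the entries for observed tokens before running the dynamic program; the sketch below is illustrative, with hypothetical names and shapes.

```python
def masked_axe_costs(target, log_probs, observed):
    """Align-cost matrix for the 'predict masks' variant described above.

    C[i][j] is the cost of aligning target token i with prediction j; a DP
    like the one in Section 2 can then consume C unchanged. For an observed
    token aligned with the prediction at its own position, the cost is
    zeroed so the loss focuses on the masked tokens.
    """
    n, m = len(target), len(log_probs)
    C = [[-log_probs[j][target[i]] for j in range(m)] for i in range(n)]
    for i in range(n):
        if observed[i] and i < m:
            C[i][i] = 0.0  # no penalty: this token was given as input
    return C
```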
Table 2: BLEU of CMLMs trained with cross entropy vs. AXE.

Model | WMT’14 EN-DE | WMT’14 DE-EN | WMT’16 EN-RO | WMT’16 RO-EN | WMT’17 EN-ZH | WMT’17 ZH-EN
Cross Entropy CMLM (Ghazvininejad et al., 2019) | 18.05 | 21.83 | 27.32 | 28.20 | 24.23 | 13.64
AXE CMLM (Ours) | 23.53 | 27.90 | 30.75 | 31.54 | 30.88 | 19.79
Table 3: BLEU of purely non-autoregressive models (single decoding iteration) on WMT’14 and WMT’16.

Model | Decoding Iterations | WMT’14 EN-DE | WMT’14 DE-EN | WMT’16 EN-RO | WMT’16 RO-EN
Autoregressive
Transformer Base | — | 27.61 | 31.38 | 34.28 | 33.99
+ Knowledge Distillation | — | 27.75 | 31.30 | — | —
Non-Autoregressive
Iterative Refinement (Lee et al., 2018) | 1 | 13.91 | 16.77 | 24.45 | 25.73
CTC Loss (Libovický and Helcl, 2018) | 1 | 17.68 | 19.80 | 19.93 | 24.71
NAT w/ Fertility (Gu et al., 2018) | 1 | 17.69 | 21.47 | 27.29 | 29.06
Cross Entropy CMLM (Ghazvininejad et al., 2019) | 1 | 18.05 | 21.83 | 27.32 | 28.20
Auxiliary Regularization (Wang et al., 2019) | 1 | 20.65 | 24.77 | — | —
Bag-of-ngrams Loss (Shao et al., 2019) | 1 | 20.90 | 24.61 | 28.31 | 29.29
Hint-based Training (Li et al., 2019) | 1 | 21.11 | 25.24 | — | —
FlowSeq (Ma et al., 2019) | 1 | 21.45 | 26.16 | 29.34 | 30.44
Bigram CRF (Sun et al., 2019) | 1 | 23.44 | 27.22 | — | —
AXE CMLM (Ours) | 1 | 23.53 | 27.90 | 30.75 | 31.54
4 Experiments
We evaluate CMLMs trained with AXE on 6 standard machine translation benchmarks, and demonstrate that AXE significantly improves performance over cross-entropy-trained CMLMs and over recently-proposed non-autoregressive models as well.
4.1 Setup
Translation Benchmarks
We evaluate our method on both directions of three standard machine translation datasets with various training data sizes: WMT’14 English-German (4.5M sentence pairs), WMT’16 English-Romanian (610k pairs), and WMT’17 English-Chinese (20M pairs). The datasets are tokenized into subword units using BPE (Sennrich et al., 2016).^5 We use the same data and preprocessing as Vaswani et al. (2017), Lee et al. (2018), and Wu et al. (2019) for WMT’14 EN-DE, WMT’16 EN-RO, and WMT’17 EN-ZH, respectively. We evaluate performance with BLEU (Papineni et al., 2002) for all language pairs, except for translating English to Chinese, where we use SacreBLEU (Post, 2018).^6

^5 We run joint BPE for all language pairs except English-Chinese.
^6 SacreBLEU hash: BLEU+case.mixed+lang.en-zh+numrefs.1+smooth.exp+test.wmt17+tok.zh+version.1.3.7
Hyperparameters
We generally follow the transformer base hyperparameters (Vaswani et al., 2017): 6 layers for the encoder and decoder, 8 attention heads per layer, 512 model dimensions, and 2048 hidden dimensions. We follow the weight initialization scheme from BERT (Devlin et al., 2018): we sample weights from $\mathcal{N}(0, 0.02^2)$, set biases to zero, and set layer normalization parameters to $\beta = 0$ and $\gamma = 1$. For regularization, we apply dropout, weight decay, and label smoothing. We train batches of 128k tokens using Adam (Kingma and Ba, 2015). The learning rate warms up within 10k steps, and then decays with the inverse square-root schedule. We train all models for 300k steps. We measure the validation loss at the end of each epoch, and average the 5 best checkpoints based on their validation loss to create the final model. We train all models with mixed precision floating point arithmetic on 16 Nvidia V100 GPUs. For autoregressive decoding, we use beam search (Vaswani et al., 2017) and tune the length penalty on the validation set. Similarly, for CMLM models we tune the number of length candidates, the length multiplier,^7 and the target skipping penalty ($\delta$) on the validation set.

^7 Our preliminary analysis shows that AXE selects Skip Prediction 5-10% of the time, roughly suggesting that five to ten percent of generated tokens are epsilons. Hence, we search the same range for the length multiplier.

Knowledge Distillation
Similar to previous work on non-autoregressive translation (Gu et al., 2018; Lee et al., 2018; Ghazvininejad et al., 2019; Stern et al., 2019), we use sequence-level knowledge distillation (Kim and Rush, 2016) by training CMLMs on translations generated by a standard left-to-right transformer model (transformer large for WMT’14 EN-DE and WMT’17 EN-ZH, transformer base for WMT’16 EN-RO). We report the performance of standard autoregressive base transformers trained on distilled data for WMT’14 EN-DE and WMT’17 EN-ZH.
Table 4: BLEU of non-autoregressive models trained on raw data (without knowledge distillation), with a single decoding iteration.

Model | Decoding Iterations | WMT’14 EN-DE | WMT’14 DE-EN | WMT’16 EN-RO | WMT’16 RO-EN
Knowledge Distillation
AXE CMLM (Ours) | 1 | 23.53 | 27.90 | 30.75 | 31.54
Raw Data
Cross Entropy CMLM (Ghazvininejad et al., 2019) | 1 | 10.64 | — | 21.22 | —
CTC Loss (Libovický and Helcl, 2018) | 1 | 17.68 | 19.80 | 19.93 | 24.71
FlowSeq (Ma et al., 2019) | 1 | 18.55 | 23.36 | 29.26 | 30.16
AXE CMLM (Ours) | 1 | 20.40 | 24.90 | 30.47 | 31.42
4.2 Main Results
AXE vs Cross Entropy
We first compare the performance of AXE-trained CMLMs to that of CMLMs trained with the original cross entropy loss. Table 2 shows that training with AXE substantially increases the performance of CMLMs across all benchmarks. On average, we gain 5.2 BLEU points by replacing cross entropy with AXE, with gains of up to 6.65 BLEU on WMT’17 EN-ZH.
State of the Art
We compare the performance of CMLMs with AXE against nine strong baseline models: the fertility-based sequence-to-sequence model (Gu et al., 2018), transformers trained with CTC loss (Libovický and Helcl, 2018), the iterative refinement approach (Lee et al., 2018), transformers trained with auxiliary regularization (Wang et al., 2019), CMLMs trained with (regular) cross entropy loss (Ghazvininejad et al., 2019), FlowSeq, a latent variable model based on generative flow (Ma et al., 2019), hint-based training (Li et al., 2019), bag-of-ngrams training (Shao et al., 2019), and the CRF-based semi-autoregressive model (Sun et al., 2019). All of these models except the last one are purely non-autoregressive, while the CRF-based model uses bigram statistics during decoding, which deviates from the purely non-autoregressive setting.^8

^8 CMLMs (Ghazvininejad et al., 2019) and the iterative refinement method (Lee et al., 2018) are presented as semi-autoregressive models that run in multiple decoding iterations. However, the first decoding iteration of these models is purely non-autoregressive, which is what we use as our baselines.
Table 3 shows that our system yields the highest BLEU scores of all non-autoregressive models. AXE-trained CMLMs outperform the best purely non-autoregressive model (FlowSeq) on both directions of WMT’14 EN-DE and WMT’16 EN-RO by 1.6 BLEU on average. Moreover, our approach achieves higher BLEU scores than the semi-autoregressive CRF decoder across all available benchmarks.
Raw Data
Finally, we compare the performance of AXE to other methods that train on raw data, without knowledge distillation. Table 4 shows that AXE CMLMs still significantly outperform other non-autoregressive models in the raw data scenario. In addition, the comparison between raw data and knowledge distillation training follows previously-published results that demonstrate the importance of knowledge distillation for non-autoregressive approaches (Gu et al., 2018; Ghazvininejad et al., 2019; Zhou et al., 2019), although the gap is much smaller for WMT’16 EN-RO.
4.3 Ablation Study
In this section, we consider several variations of our proposed method to investigate the effect of each component. We test the performance of AXE CMLMs with these variations on the WMT’14 DE-EN and EN-DE datasets. To prevent overfitting, we evaluate on the validation set.
Table 5: Validation BLEU for different training objectives on WMT’14.

Input Tokens | Loss Function | EN-DE | DE-EN
Unobserved | All Tokens | 21.97 | 26.32
Partially-Observed | All Tokens | 22.80 | 27.59
Partially-Observed | Only Masks | 23.13 | 28.01
Different Training Objectives  Table 5 shows the effects of different training objectives (Section 3.3), in which all or part of the target tokens are masked and the loss function is calculated on all tokens or on masked tokens only. We find that simulating the inference scenario, where all tokens are unobserved, is actually less effective than revealing a subset of the target tokens as input during training. We speculate that partially-observed inputs add easier examples to the training set, allowing for better optimization, as in curriculum learning (Bengio et al., 2009). We also see that including only the masked tokens in the loss function gives a modest but consistent boost in performance, possibly because the training signal is focused on the actual task.
Table 6: Validation BLEU and the percentage of skip target operations for different skip target penalties ($\delta$) on WMT’14.

Penalty ($\delta$) | EN-DE BLEU | EN-DE Skip Target | DE-EN BLEU | DE-EN Skip Target
1 | 22.60 | 17.57% | 26.84 | 16.87%
2 | 23.01 | 10.91% | 27.77 | 10.53%
3 | 22.85 | 9.56% | 27.87 | 9.04%
4 | 22.90 | 8.14% | 28.01 | 7.83%
5 | 23.13 | 7.40% | 27.79 | 6.95%
Skip Target Penalty  The hyperparameter $\delta$ acts as a coefficient for the penalty associated with skipping a target token (see Table 1 for a definition). We experiment with different values of $\delta$, and report our findings in Table 6. We observe that tuning $\delta$ can significantly improve performance with respect to the default of $\delta = 1$. As intended, high values of $\delta$ discourage alignments that skip too many target tokens.
Length Multiplier  The length multiplier inflates the length predicted by a CMLM to account for extra blank tokens ($\varepsilon$) that the model could potentially generate (see Section 3.2 for more detail). Table 7 compares the effect of different length multiplier values. Using the best length multiplier increases performance by 0.53 BLEU on average for WMT’14 EN-DE and WMT’16 EN-RO.
Table 7: Validation BLEU for different length multiplier values on WMT’14 and WMT’16.

Length Multiplier | EN-DE | DE-EN | EN-RO | RO-EN
— | 22.96 | 27.50 | 30.43 | 32.22
— | 23.06 | 27.56 | 30.43 | 32.25
— | 23.09 | 27.70 | 30.66 | 32.50
— | 23.11 | 27.81 | 30.75 | 32.69
— | 23.13 | 27.85 | 30.88 | 32.83
— | 23.13 | 27.93 | 30.88 | 32.94
— | 23.06 | 28.01 | 31.01 | 32.84
— | 23.09 | 27.93 | 31.10 | 32.61
— | 23.06 | 27.71 | 31.14 | 32.45
— | 23.07 | 27.68 | 31.06 | 32.14
— | 22.92 | 27.49 | 30.85 | 32.01
5 Analysis
We provide a qualitative analysis to offer some insight into where AXE improves over cross entropy, and potential directions for future research on non-autoregressive generation.
AXE Handles Long Sequences Better
We first measure the performance of cross entropy versus AXE-trained CMLMs for different sequence lengths. We use compare-mt (Neubig et al., 2019) to split the test sets of WMT’14 EN-DE and DE-EN into different buckets based on target sequence length, and calculate BLEU for each bucket. Table 8 shows that the performance of models trained with cross entropy drops drastically as the sequence length increases, while the performance of AXE-trained models remains relatively stable. One explanation for this result is that the longer the sequence, the more likely we are to observe misalignments between the model’s predictions and the target; AXE realigns these cases, providing the model with a cleaner signal for modeling long sequences.
Table 8: Test-set BLEU per target sequence length bucket on WMT’14.

WMT’14 EN-DE:
Length Bucket | Cross Entropy | AXE
— | 18.75 | 20.48
— | 21.69 | 23.92
— | 18.64 | 24.21
— | 15.37 | 22.65
— | 14.04 | 23.04
— | 11.62 | 23.43

WMT’14 DE-EN:
Length Bucket | Cross Entropy | AXE
— | 22.57 | 24.39
— | 25.28 | 27.86
— | 22.43 | 28.78
— | 19.03 | 27.18
— | 16.16 | 27.55
— | 12.23 | 27.64
AXE Increases Position Confidence
We also study how confident each model is about the position of each generated token. Ideally, we would like each predicted token to have a high probability at the position in which it was predicted and a very low probability in the neighboring positions. After applying argmax decoding, we compute the probability assigned to each generated token in all positions of the sequence, and average these probabilities based on the relative distance (positive or negative) to the generated position. Figure 3 plots these averaged probabilities for both short and long target sequences.
Both models are rather confident in their predictions for short sequences (Figure 3(a)): the probability has a high peak at the generated position and drops rapidly as we move further away. However, for longer sentences (Figure 3(b)), we observe that the plot for cross entropy loses its sharpness. Specifically, the immediate neighbors of the prediction position receive, on average, almost a third of the peak probability. Meanwhile, the probabilities predicted by the AXE-trained model are significantly sharper, assigning negligible probability to the generated token in neighboring positions when compared to the center.
One way to explain this result is that cross entropy training encourages predictions to place some probability mass on their neighbors, in order to “hedge their bets” in case the predictions are misaligned with the target. Since AXE finds the best alignment before computing the actual loss, spreading a token’s probability mass among its neighbors is no longer necessary.
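The averaging procedure described above can be sketched as follows (an illustrative reimplementation with assumed names and shapes, not the analysis code used in the paper):

```python
from collections import defaultdict

def confidence_by_offset(probs, tokens):
    """Average probability of each generated token by relative distance.

    probs[j][v]: probability of token v at position j.
    tokens[k]: the token generated (via argmax) at position k.
    For each generated token, we read off its probability at every position
    and bucket it by the signed distance to its generation position.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for k, tok in enumerate(tokens):
        for j in range(len(probs)):
            d = j - k  # relative distance (negative = earlier position)
            sums[d] += probs[j][tok]
            counts[d] += 1
    return {d: sums[d] / counts[d] for d in sums}
```

A sharply peaked model yields a high average at distance 0 and near-zero averages elsewhere.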
AXE Reduces Multimodality
We further argue that AXE reduces the multimodality problem in non-autoregressive machine translation (Gu et al., 2018). Due to minimal coordination between predictions in many non-autoregressive models, a model might consider many possible translations at the same time. In this situation, the model might merge two or more different translations and generate an inconsistent output that is typically characterized by token repetitions. We therefore use the frequency of repeated tokens as a proxy for measuring multimodality in a model.
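This proxy is straightforward to compute; the sketch below is our own, assuming the rate is defined as the fraction of tokens that repeat their immediate predecessor:

```python
def repetition_rate(sentences):
    """Fraction of generated tokens that repeat the immediately preceding
    token -- a proxy for multimodality in non-autoregressive output."""
    repeats = total = 0
    for toks in sentences:
        for prev, cur in zip(toks, toks[1:]):
            repeats += (prev == cur)
        total += max(len(toks) - 1, 0)
    return repeats / total if total else 0.0
```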
Table 9 shows the repetition rate for cross entropy and AXE-trained CMLMs. Replacing cross entropy with AXE drastically reduces multimodality, decreasing the number of repetitions by a multiplicative factor of 12.
Table 9: Token repetition rates on WMT’14.

Model | EN-DE | DE-EN
Cross Entropy CMLM | 16.72% | 12.31%
AXE CMLM | 1.41% | 1.03%
6 Related Work
Advances in neural machine translation techniques in recent years have brought an increasing interest in breaking the autoregressive generation bottleneck in translation models.
Semi-autoregressive models introduce partial parallelism into the decoding process. Some of these techniques include iterative refinement of translations based on previous predictions (Lee et al., 2018; Ghazvininejad et al., 2019, 2020; Gu et al., 2019; Kasai et al., 2020) and combining a lighter autoregressive decoder with a non-autoregressive one (Sun et al., 2019).
Building a fully non-autoregressive machine translation model is a much more challenging task. One branch of prior work approaches this problem with latent-variable models. Gu et al. (2018) introduce word fertility as a latent variable to model the number of generated tokens per source word. Ma et al. (2019) use generative flow to model a complex distribution over latent variables for parallel decoding of the target. Shu et al. (2019) propose a latent-variable non-autoregressive model with continuous latent variables and a deterministic inference procedure.
There is also work that develops alternative loss functions for non-autoregressive machine translation. Libovický and Helcl (2018) use the Connectionist Temporal Classification training objective, a loss function from the speech recognition literature that is designed to eliminate repetitions. Li et al. (2019) use the learning signal provided by the hidden states and attention distributions of an autoregressive teacher. Yang et al. (2019) improve the decoder’s hidden representations by adding the reconstruction error of the source sentence from these representations as an auxiliary regularization term to the loss function. Finally, Shao et al. (2019) introduce the bag-of-ngrams training objective to encourage the model to capture target-side sequential dependencies.

7 Conclusion
We introduced Aligned Cross Entropy (AXE) as an alternative loss function for training non-autoregressive models. AXE focuses on relative order and lexical matching instead of relying on absolute positions. We showed that, in the context of machine translation, a conditional masked language model (CMLM) trained with AXE significantly outperforms cross-entropy-trained models, setting a new state of the art for non-autoregressive models.
Acknowledgements
We thank Abdelrahman Mohamed for sharing his expertise on non-autoregressive models, and our colleagues at FAIR for valuable feedback.
References

- Bengio et al. (2009). Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pp. 41-48.
- Devlin et al. (2018). BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- Ghazvininejad et al. (2019). Mask-Predict: parallel decoding of conditional masked language models. In Proc. of EMNLP-IJCNLP.
- Ghazvininejad et al. (2020). Semi-autoregressive training improves mask-predict decoding. arXiv preprint arXiv:2001.08785.
- Gu et al. (2018). Non-autoregressive neural machine translation. In Proc. of ICLR.
- Gu et al. (2019). Levenshtein transformer. In Proc. of NeurIPS.
- Kasai et al. (2020). Parallel machine translation with disentangled context transformer. arXiv preprint arXiv:2001.05136.
- Kim and Rush (2016). Sequence-level knowledge distillation. In Proc. of EMNLP.
- Kingma and Ba (2015). Adam: a method for stochastic optimization. In International Conference on Learning Representations.
- Lee et al. (2018). Deterministic non-autoregressive neural sequence modeling by iterative refinement. In Proc. of EMNLP.
- Li et al. (2019). Hint-based training for non-autoregressive machine translation. arXiv preprint arXiv:1909.06708.
- Libovický and Helcl (2018). End-to-end non-autoregressive neural machine translation with connectionist temporal classification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3016-3021.
- Ma et al. (2019). FlowSeq: non-autoregressive conditional sequence generation with generative flow. arXiv preprint arXiv:1909.02480.
- Neubig et al. (2019). compare-mt: a tool for holistic comparison of language generation systems. In Proceedings of NAACL (Demo Track).
- Papineni et al. (2002). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.
- Post (2018). A call for clarity in reporting BLEU scores. In Proceedings of the Third Conference on Machine Translation: Research Papers.
- Sakoe and Chiba (1978). Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 26(1), pp. 43-49.
- Sennrich et al. (2016). Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Shao et al. (2019). Minimizing the bag-of-ngrams difference for non-autoregressive neural machine translation. arXiv preprint arXiv:1911.09320.
- Shu et al. (2019). Latent-variable non-autoregressive neural machine translation with deterministic inference using a delta posterior. arXiv preprint arXiv:1908.07181.
- Stern et al. (2019). Insertion transformer: flexible sequence generation via insertion operations. In Proc. of ICML.
- Sun et al. (2019). Fast structured decoding for sequence models. In Advances in Neural Information Processing Systems, pp. 3011-3020.
- Vaswani et al. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998-6008.
- Wang et al. (2019). Non-autoregressive machine translation with auxiliary regularization. In Proc. of AAAI.
- Wu et al. (2019). Pay less attention with lightweight and dynamic convolutions. In International Conference on Learning Representations.
- Non-autoregressive video captioning with iterative refinement. arXiv preprint arXiv:1911.12018.
- Zhou et al. (2019). Understanding knowledge distillation in non-autoregressive machine translation. arXiv preprint arXiv:1911.02727.