Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

11/20/2019
by Junliang Guo, et al.

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and that the two share the same model configurations, a natural idea for improving the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process that progressively switches the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves a notable improvement (more than 1 BLEU point) over previous NAT baselines in translation accuracy, while greatly speeding up (more than 10 times) the inference process relative to AT baselines.
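The abstract does not spell out how the switch is paced, but the core idea, gradually replacing autoregressive decoder inputs with non-autoregressive ones during fine-tuning, can be sketched in a few lines. The sketch below is illustrative only: the linear pacing function, the token-level mixing, and the names curriculum_ratio, make_decoder_inputs, BOS_ID, and MASK_ID are assumptions for this example, not the paper's actual implementation.

    import random

    BOS_ID, MASK_ID = 1, 4  # hypothetical special-token ids

    def curriculum_ratio(step, total_steps):
        # Fraction of decoder positions trained non-autoregressively
        # at this step; a linear pacing schedule is assumed here.
        return min(1.0, step / total_steps)

    def make_decoder_inputs(target_ids, step, total_steps):
        # ratio = 0.0: every position sees the previous gold token,
        #   i.e. standard autoregressive teacher forcing.
        # ratio = 1.0: every position sees MASK_ID, i.e. fully
        #   non-autoregressive inputs predicted in parallel.
        ratio = curriculum_ratio(step, total_steps)
        shifted = [BOS_ID] + list(target_ids[:-1])  # AR teacher-forcing inputs
        return [MASK_ID if random.random() < ratio else tok for tok in shifted]

    # Halfway through fine-tuning, roughly half the inputs are masked.
    print(make_decoder_inputs([17, 23, 42, 8, 2], step=500, total_steps=1000))

Under such a schedule the model starts from the behavior of the well-trained AT model and is eased into the fully parallel NAT regime, which is what the curriculum is meant to achieve.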

Related research

Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation (07/17/2020)
Non-autoregressive translation (NAT) achieves faster inference speed but...

Imitation Learning for Non-Autoregressive Neural Machine Translation (06/05/2019)
Non-autoregressive translation models (NAT) have achieved impressive inf...

Hint-Based Training for Non-Autoregressive Machine Translation (09/15/2019)
Due to the unparallelizable nature of the autoregressive factorization, ...

A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation (03/01/2023)
The effectiveness of Neural Machine Translation (NMT) models largely dep...

Improving Non-autoregressive Translation Quality with Pretrained Language Model, Embedding Distillation and Upsampling Strategy for CTC (06/10/2023)
Non-autoregressive approaches aim to improve the inference speed of tran...

Hybrid-Regressive Neural Machine Translation (10/19/2022)
In this work, we empirically confirm that non-autoregressive translation...

Context-Aware Cross-Attention for Non-Autoregressive Translation (11/02/2020)
Non-autoregressive translation (NAT) significantly accelerates the infer...
