
Improving Fluency of Non-Autoregressive Machine Translation

by   Zdeněk Kasner, et al.

Non-autoregressive (NAR) models for machine translation (MT) decode considerably faster than autoregressive (AR) models, at the expense of impaired fluency of their outputs. We improve the fluency of a NAR model trained with connectionist temporal classification (CTC) by employing additional features in the scoring model used during beam search decoding. Since beam search decoding in our model only requires running the network in a single forward pass, the decoding speed remains notably higher than in standard AR models. We train models for three language pairs, German–English, Czech–English, and Romanian–English, in both directions. The results show that our proposed models can be more efficient in terms of decoding speed while still achieving a BLEU score competitive with AR models.
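The decoding scheme described above can be illustrated with a minimal sketch of CTC prefix beam search. This is not the paper's actual implementation: the `extra_score` hook is a hypothetical stand-in for the additional features (e.g. a language-model or fluency score) that the paper's scoring model applies when ranking hypotheses during the search.

```python
from collections import defaultdict

def ctc_beam_search(probs, beam_size=4, blank=0, extra_score=None):
    """CTC prefix beam search over a T x V matrix of per-step symbol
    probabilities.  `extra_score` (hypothetical) maps a prefix, a tuple
    of symbol ids, to a multiplicative feature weight standing in for
    the additional scoring-model features used at pruning time."""
    # prefix -> [prob of paths ending in blank, prob ending in non-blank]
    beams = {(): [1.0, 0.0]}

    def score(item):
        prefix, (p_blank, p_nonblank) = item
        s = p_blank + p_nonblank
        if extra_score is not None:
            s *= extra_score(prefix)  # extra feature, e.g. LM probability
        return s

    for step in probs:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_nonblank) in beams.items():
            for sym, p in enumerate(step):
                if sym == blank:
                    # emitting blank leaves the collapsed prefix unchanged
                    nxt[prefix][0] += (p_blank + p_nonblank) * p
                elif prefix and prefix[-1] == sym:
                    # a repeated symbol only lengthens the prefix if a
                    # blank separated the two occurrences
                    nxt[prefix + (sym,)][1] += p_blank * p
                    nxt[prefix][1] += p_nonblank * p
                else:
                    nxt[prefix + (sym,)][1] += (p_blank + p_nonblank) * p
        # prune to the top `beam_size` prefixes under the combined score
        beams = dict(sorted(nxt.items(), key=score, reverse=True)[:beam_size])

    return max(beams.items(), key=score)[0]
```

Because the per-step probabilities come from a single forward pass of the NAR network, the search itself is cheap: only the scoring and pruning loop runs per time step, with no repeated network evaluations as in AR decoding.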



