Lossless Speedup of Autoregressive Translation with Generalized Aggressive Decoding

03/30/2022
by   Heming Xia, et al.

In this paper, we propose Generalized Aggressive Decoding (GAD), a novel decoding paradigm that speeds up autoregressive translation without quality loss through the collaboration of autoregressive translation and non-autoregressive translation (NAT) with the Transformer. At each decoding iteration, GAD aggressively decodes a number of tokens in parallel as a draft with NAT and then verifies them in an autoregressive manner; only the tokens that pass verification are kept as decoded tokens. GAD achieves the same performance as autoregressive translation but much more efficiently, because both NAT drafting and autoregressive verification are fast thanks to parallel computation. We conduct experiments on the WMT14 English-German translation task and confirm that vanilla GAD yields exactly the same results as greedy decoding with a speedup of around 3x. Its variant, GAD++, uses an advanced verification strategy that not only outperforms greedy decoding, achieving translation quality comparable to beam search, but also further improves decoding speed, for a speedup of around 5x over autoregressive translation. Our models and code are available at https://github.com/hemingkx/Generalized-Aggressive-Decoding.
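The draft-then-verify loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `gad_decode`, `toy_next_token`, and `toy_draft` are hypothetical, and the toy "models" are plain functions over a fixed target sequence. The sketch also assumes that, at the first drafted token that fails verification, the verifier's own greedy token is kept in its place and the rest of the draft is discarded; the abstract itself only states that tokens passing verification are kept. In a real Transformer the verification of all drafted positions would happen in one parallel forward pass, whereas the loop below checks them one by one for clarity.

```python
from typing import Callable, List, Sequence, Tuple

EOS = 1
TARGET = [5, 8, 2, 9, 4, 7, EOS]  # toy "reference" greedy output (hypothetical)


def toy_next_token(prefix: Sequence[int]) -> int:
    """Stand-in for the autoregressive model's greedy next-token choice."""
    i = len(prefix)
    return TARGET[i] if i < len(TARGET) else EOS


def toy_draft(prefix: Sequence[int], k: int) -> List[int]:
    """Stand-in for the NAT drafter: guesses k tokens at once and
    deliberately gets every third position wrong (emits 0)."""
    start = len(prefix)
    draft = []
    for i in range(start, start + k):
        guess = TARGET[i] if i < len(TARGET) else EOS
        draft.append(0 if i % 3 == 2 else guess)
    return draft


def greedy_decode(next_token: Callable[[Sequence[int]], int],
                  max_len: int) -> List[int]:
    """Plain autoregressive greedy decoding: one token per step."""
    out: List[int] = []
    while len(out) < max_len:
        out.append(next_token(out))
        if out[-1] == EOS:
            break
    return out


def gad_decode(draft: Callable[[Sequence[int], int], List[int]],
               next_token: Callable[[Sequence[int]], int],
               k: int, max_len: int) -> Tuple[List[int], int]:
    """Draft k tokens, keep the verified prefix, repeat.
    Returns the decoded tokens and the number of draft/verify iterations."""
    if k < 1:
        raise ValueError("k must be at least 1")
    out: List[int] = []
    iterations = 0
    while len(out) < max_len and (not out or out[-1] != EOS):
        iterations += 1
        for drafted in draft(out, k):  # drafter is assumed to return k tokens
            expected = next_token(out)  # verifier's greedy token at this position
            out.append(expected)        # equals `drafted` whenever the draft passes
            if expected != drafted or expected == EOS or len(out) >= max_len:
                break                   # discard the rest of the draft
    return out, iterations


gad_out, gad_iters = gad_decode(toy_draft, toy_next_token, k=4, max_len=20)
greedy_out = greedy_decode(toy_next_token, max_len=20)
print(gad_out == greedy_out, gad_iters, len(greedy_out))
```

Because every kept token is the verifier's greedy choice, the output in this sketch matches greedy decoding exactly, while the number of iterations is smaller than the number of tokens whenever the drafter guesses some tokens correctly.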

research
05/20/2022

Lossless Acceleration for Seq2seq Generation with Aggressive Decoding

We study lossless acceleration for seq2seq generation with a novel decod...
research
05/17/2023

Accelerating Transformer Inference for Translation via Parallel Decoding

Autoregressive decoding limits the efficiency of transformers for Machin...
research
04/07/2020

Improving Fluency of Non-Autoregressive Machine Translation

Non-autoregressive (nAR) models for machine translation (MT) manifest su...
research
09/23/2021

The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21

This paper describes the Volctrans' submission to the WMT21 news transla...
research
05/31/2021

Effective Batching for Recurrent Neural Network Grammars

As a language model that integrates traditional symbolic operations and ...
research
05/25/2023

Revisiting Non-Autoregressive Translation at Scale

In real-world systems, scaling has been critical for improving the trans...
research
10/19/2022

Hybrid-Regressive Neural Machine Translation

In this work, we empirically confirm that non-autoregressive translation...
