A Stable and Effective Learning Strategy for Trainable Greedy Decoding

04/21/2018
by   Yun Chen, et al.

As a widely used approximate search strategy for neural network decoders, beam search generally outperforms simple greedy decoding on machine translation, but at substantial computational cost. In this paper, we propose a method for training a small neural network actor that observes and manipulates the hidden state of a previously trained decoder. With this trained actor, greedy decoding achieves translation results comparable to those that would otherwise be found only with the more expensive beam search. To train the actor network, we introduce a pseudo-parallel corpus built from the outputs of beam search on a base model, ranked by a target quality metric such as BLEU. Experiments on three parallel corpora and three translation architectures (RNN-based, ConvS2S, and Transformer) show that our method yields substantial improvements in translation quality and speed over each base system, with no additional data.
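The pseudo-parallel corpus construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tokenization, the add-one-smoothed sentence-level BLEU, and all function names are assumptions introduced here for clarity.

```python
from collections import Counter
import math

def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Smoothed sentence-level BLEU: geometric mean of n-gram precisions
    (add-one smoothing) times a brevity penalty. Illustrative only."""
    if not candidate:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        # add-one smoothing keeps short sentences from scoring exactly zero
        log_prec += math.log((overlap + 1) / (total + 1))
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_prec / max_n)

def build_pseudo_parallel(source, beam_hypotheses, reference):
    """Rank the base model's beam-search outputs by BLEU against the
    reference and pair the source with the best hypothesis, yielding a
    training example for the actor network."""
    best = max(beam_hypotheses, key=lambda hyp: sentence_bleu(hyp, reference))
    return source, best
```

For example, given a beam of three hypotheses, the highest-BLEU one becomes the actor's target:

```python
src = "das ist ein Test".split()
ref = "this is a test".split()
beams = [b.split() for b in ["this is test", "this is a test", "it is a test"]]
pair = build_pseudo_parallel(src, beams, ref)
# pair[1] == ["this", "is", "a", "test"]
```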


