A Stable and Effective Learning Strategy for Trainable Greedy Decoding

04/21/2018 ∙ by Yun Chen, et al. ∙ NYU ∙ The University of Hong Kong

As a widely used approximate search strategy for neural network decoders, beam search generally outperforms simple greedy decoding on machine translation, but at substantial computational cost. In this paper, we propose a method by which we can train a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. The use of this trained actor makes it possible to achieve translation results with greedy decoding comparable to those that would otherwise be found only with more expensive beam search. To train this actor network, we introduce the use of a pseudo-parallel corpus built from the output of beam search on a base model, ranked by a target quality metric like BLEU. Experiments on three parallel corpora and three translation system architectures (RNN-based, ConvS2S and Transformer) show that our method yields substantial improvements in translation quality and speed over each base system, with no additional data.


1 Introduction

Neural network sequence decoders yield state-of-the-art results for many text generation tasks, including machine translation (Bahdanau et al., 2015; Luong et al., 2015; Gehring et al., 2017; Vaswani et al., 2017; Dehghani et al., 2018), text summarization (Rush et al., 2015; Ranzato et al., 2015; See et al., 2017; Paulus et al., 2017) and image captioning (Vinyals et al., 2015; Xu et al., 2015). These decoders generate tokens from left to right, at each step giving a distribution over possible next tokens, conditioned on both the input and all the tokens generated so far. However, since the space of all possible output sequences is infinite and grows exponentially with sequence length, heuristic search methods such as greedy decoding or beam search (Graves, 2012; Boulanger-Lewandowski et al., 2013) must be used at decoding time to select high-probability output sequences. Unlike greedy decoding, which selects the most probable token at each step, beam search expands all possible next tokens at each step and maintains the $k$ most likely prefixes, where $k$ is the beam size. Greedy decoding is very fast, requiring only a single run of the underlying decoder, while beam search requires the equivalent of $k$ such runs, as well as substantial additional overhead for data management. However, beam search often leads to substantial improvements over greedy decoding. For example, Ranzato et al. (2015) report that beam search gives a 2.2 BLEU improvement in translation and a 3.5 ROUGE-2 improvement in summarization over greedy decoding.

Various approaches have been explored recently to improve beam search by changing the method by which candidate sequences are scored (Li et al., 2016; Shu and Nakayama, 2017), the termination criterion (Huang et al., 2017), or the search function itself (Li et al., 2017). In contrast, Gu et al. (2017) have tried to improve greedy decoding directly so that it can decode for an arbitrary decoding objective. They add a small actor network to the decoder and train it with a version of policy gradient to optimize sequence objectives like BLEU. However, they report that they are seriously limited by the instability of this training approach.

In this paper, we propose a procedure to modify a trained decoder so that it can generate text greedily with the level of quality (according to metrics like BLEU) that would otherwise require the relatively expensive use of beam search. To do so, we follow Cho (2016) and Gu et al. (2017) in our use of an actor network which manipulates the decoder's hidden state, but introduce a stable and effective procedure to train this actor. In our training procedure, the actor is trained with ordinary backpropagation on a model-specific artificial parallel corpus. This corpus is generated by running the un-augmented model on the training set with large-beam beam search, and selecting outputs from the resulting $k$-best list which score highly on our target metric.

Our method can be trained quickly and reliably, is effective, and can be straightforwardly employed with a variety of decoders. We demonstrate this for neural machine translation on three state-of-the-art architectures: RNN-based (Luong et al., 2015), ConvS2S (Gehring et al., 2017) and Transformer (Vaswani et al., 2017), and three corpora: IWSLT16 German-English (https://wit3.fbk.eu/), WMT15 Finnish-English (http://www.statmt.org/wmt15/translation-task.html) and WMT14 German-English (http://www.statmt.org/wmt14/translation-task).

2 Background

2.1 Neural Machine Translation

In sequence-to-sequence learning, we are given a set of source–target sentence pairs and tasked with learning to generate each target sentence (as a sequence of words or word-parts) from its source sentence. We first use an encoding model such as a recurrent neural network to transform a source sequence into an encoded representation, then generate the target sequence using a neural decoder.

Given a source sentence $X$, a neural machine translation system models the distribution over possible output sentences $Y = (y_1, \ldots, y_T)$ as:

$$p(Y \mid X; \theta) = \prod_{t=1}^{T} p(y_t \mid y_{<t}, X; \theta) \qquad (1)$$

where $\theta$ is the set of model parameters.

Given a parallel corpus $D = \{(X^{(n)}, Y^{(n)})\}_{n=1}^{N}$ of source–target sentence pairs, the neural machine translation model can be trained by maximizing the log-likelihood:

$$\hat{\theta} = \arg\max_{\theta} \sum_{n=1}^{N} \log p(Y^{(n)} \mid X^{(n)}; \theta) \qquad (2)$$

2.2 Decoding

Figure 1: A single step of a generic actor interacting with a decoder of each of three types. The dashed arrows denote an optional recurrent connection in the actor network.

Given estimated model parameters $\hat{\theta}$, the decision rule for finding the translation with the highest probability for a source sentence $X$ is given by

$$\hat{Y} = \arg\max_{Y} \log p(Y \mid X; \hat{\theta}) \qquad (3)$$

However, since such exact inference requires the intractable enumeration of a large and potentially infinite set of candidate sequences, we resort to approximate decoding algorithms such as greedy decoding, beam search, noisy parallel decoding (NPAD; Cho, 2016), or trainable greedy decoding (Gu et al., 2017).

Greedy Decoding

In this algorithm, we generate a single sequence from left to right, choosing the most likely token at each step. The output can be represented as

$$\hat{y}_t = \arg\max_{y} \log p(y \mid \hat{y}_{<t}, X; \hat{\theta}) \qquad (4)$$

Despite its low computational complexity, the translations selected by this method may be far from optimal under the overall distribution given by the model.
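To make the procedure concrete, the following minimal Python sketch implements this greedy loop over a generic per-step scoring function. The `step_logprobs` interface, the toy vocabulary, and the end-of-sentence handling are illustrative assumptions rather than details from the paper.

```python
import numpy as np

EOS, MAX_LEN = 0, 50  # hypothetical end-of-sentence id and length limit

def greedy_decode(step_logprobs, src):
    """Pick the single most likely token at every step (Equation 4)."""
    prefix = []
    for _ in range(MAX_LEN):
        logp = step_logprobs(src, prefix)   # log p(y_t | y_<t, X)
        y = int(np.argmax(logp))            # most likely next token
        prefix.append(y)
        if y == EOS:
            break
    return prefix

# Toy stand-in for a trained decoder: a random distribution at each step.
rng = np.random.default_rng(0)
def toy_step(src, prefix):
    return np.log(rng.dirichlet(np.ones(8)))

print(greedy_decode(toy_step, src=None))
```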

Beam Search

Beam search decodes from left to right and maintains $k$ hypotheses at each step. At each step $t$, it considers all possible next tokens conditioned on the current hypotheses and keeps the $k$ partial hypotheses with the highest cumulative log-probability. When all the hypotheses are complete (they end in an end-of-sentence symbol or reach a predetermined length limit), it returns the hypothesis with the highest likelihood. Tuning $k$ to find a roughly optimal beam size can yield improvements in performance, with sizes as high as 30 (Koehn and Knowles, 2017; Britz et al., 2017). However, the complexity of beam search grows linearly in the beam size, with high constant terms, making it undesirable in applications where latency is important, such as on-device real-time translation.
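The same generic interface supports a compact sketch of the beam search just described. This simplified version omits length normalisation and batching, and the function and parameter names are our own rather than the paper's.

```python
import numpy as np

def beam_search(step_logprobs, src, k=4, eos=0, max_len=50):
    """Keep the k highest-scoring prefixes at every step and return the best
    completed hypothesis (simplified: no length normalisation or batching)."""
    beams = [([], 0.0)]                  # (prefix, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            logp = step_logprobs(src, prefix)          # log p(y | prefix, X)
            for y in np.argsort(logp)[-k:]:            # expand the top-k tokens
                candidates.append((prefix + [int(y)], score + float(logp[y])))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:k]:           # prune to the k best
            (finished if prefix[-1] == eos else beams).append((prefix, score))
        if not beams:
            break
    finished.extend(beams)                             # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])[0]
```

With the toy scoring function from the previous sketch, `beam_search(toy_step, None, k=4)` returns the best of the completed hypotheses, at roughly $k$ times the cost of a single greedy pass.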

NPAD

Noisy parallel approximate decoding (NPAD; Cho, 2016) is a parallel decoding algorithm that can be used to improve greedy decoding or beam search. The main idea is that a better translation with a higher probability may be found by injecting unstructured random noise into the hidden state of the decoder network. Positive results with NPAD suggest that small manipulations to the decoder hidden state can correspond to substantial but still reasonable changes to the output sequence.
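As a rough illustration of the NPAD idea, the sketch below runs several noisy greedy decodes and keeps the candidate that the unperturbed model scores highest. The `decode_with_noise` and `score` hooks and the decaying noise schedule are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def npad(decode_with_noise, score, src, n_samples=8, sigma0=0.1):
    """Run several noisy greedy decodes in parallel and keep the one the
    model itself scores highest (the NPAD idea of Cho, 2016)."""
    rng = np.random.default_rng(0)
    candidates = []
    for _ in range(n_samples):
        # Add N(0, (sigma0 / (t + 1))^2) noise to the hidden state at step t.
        noise_fn = lambda h, t: h + rng.normal(0.0, sigma0 / (t + 1), size=h.shape)
        y = decode_with_noise(src, noise_fn)    # greedy decoding with perturbed states
        candidates.append((y, score(src, y)))   # re-score under the unperturbed model
    return max(candidates, key=lambda c: c[1])[0]
```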

Trainable Greedy Decoding

Approximate decoding algorithms generally approximate the maximum-a-posteriori inference described in Equation 3. This is not necessarily the optimal basis on which to generate text, since (i) the conditional log-probability assigned by a trained NMT model does not necessarily correspond well to translation quality Tu et al. (2017), and (ii) different application scenarios may demand different decoding objectives Gu et al. (2017). To solve this, Gu et al. (2017) extend NPAD by replacing the unstructured noise with a small feedforward actor neural network. This network is trained using a variant of policy gradient reinforcement learning to optimize for a target quality metric like BLEU under greedy decoding, and is then used to guide greedy decoding at test time by modifying the decoder’s hidden states. Despite showing gains over the equivalent actorless model, their attempt to directly optimize the quality metric makes training unstable, and makes the model nearly impossible to optimize fully. This paper offers a stable and effective alternative approach to training such an actor, and further develops the architecture of the actor network.

3 Methods

We propose a method for training a small actor neural network, following the trainable greedy decoding approach of Gu et al. (2017). This actor takes as input the current decoder state $z_t$, an attentional context vector $c_t$ for the source sentence, and optionally the previous hidden state of the actor, and produces a vector-valued action $a_t$ which is used to update the decoder hidden state. The actor function can take a variety of forms, and we explore four: a feedforward network with one hidden layer (ff), a feedforward network with two hidden layers (ff2), a GRU recurrent network (rnn; Cho et al., 2014), and a gated feedforward network (gate).

The four actor functions (Equations 5–8) differ only in how the action is computed from these inputs: the ff actor applies a feedforward network with one hidden layer to the concatenation $[z_t; c_t]$; the ff2 actor uses two hidden layers; the rnn actor passes $[z_t; c_t]$ through a GRU whose previous hidden state carries information across decoding steps; and the gate actor produces the action through an elementwise gating of its inputs and introduces no additional hidden-layer size hyperparameter.

Once the action $a_t$ has been computed, the decoder hidden state $z_t$ is simply replaced with the updated state $\tilde{z}_t$:

$$\tilde{z}_t = z_t + a_t \qquad (9)$$

Figure 1 shows a single step of the actor interacting with the underlying neural decoder of each of the three NMT architectures we use: the RNN-based model of Luong et al. (2015), ConvS2S (Gehring et al., 2017), and Transformer (Vaswani et al., 2017). We add the actor at the decoder layer immediately after the computation of the attentional context vector. For the RNN-based NMT, we add the actor network only to the last decoder layer, the only place attention is used. Here, it takes as input the hidden state $z_t$ of the last decoder layer and the source context vector $c_t$, and outputs the action $a_t$, which is added back to the attentional vector. For ConvS2S and Transformer, we add an actor network to each decoder layer, at the sublayer which performs multi-step or multi-head attention over the output of the encoder stack. It takes as input the decoder state $z_t$ and the source context vector $c_t$ at that layer, and outputs an action $a_t$ which is added back to give the updated state $\tilde{z}_t$.
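For concreteness, here is a minimal PyTorch sketch of a one-hidden-layer, ff-style actor inserted after the attention computation. The module name, dimensions, and nonlinearity are our own assumptions; the paper's four actor parameterisations (including the gate variant) are not reproduced exactly.

```python
import torch
import torch.nn as nn

class FeedForwardActor(nn.Module):
    """A small ff-style actor: reads the decoder state z_t and the attentional
    context c_t, and outputs an action a_t that is added back to z_t (Eq. 9)."""

    def __init__(self, d_model: int, d_hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * d_model, d_hidden),
            nn.Tanh(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, z_t: torch.Tensor, c_t: torch.Tensor) -> torch.Tensor:
        a_t = self.net(torch.cat([z_t, c_t], dim=-1))  # action vector
        return z_t + a_t                               # updated decoder state

# Usage inside a decoder step (shapes are illustrative):
actor = FeedForwardActor(d_model=512)
z_t = torch.randn(32, 512)   # decoder hidden state after attention
c_t = torch.randn(32, 512)   # attentional source context
z_t = actor(z_t, c_t)        # state handed on to the output layer
```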

Training

To overcome the severe instability reported by Gu et al. (2017), we introduce the use of a pseudo-parallel corpus generated from the underlying NMT model (Gao and He, 2013; Auli and Gao, 2014; Kim and Rush, 2016; Chen et al., 2017; Freitag et al., 2017; Zhang et al., 2017) for actor training. This corpus includes pairs that both (i) have a high model likelihood, so that we can coerce the model to generate them without much additional training or many new parameters, and (ii) represent high-quality translations, measured according to a target metric like BLEU. We construct it by generating sentences from the original unaugmented model with large-beam beam search and selecting the best sentence from the resulting $k$-best list according to the decoding objective.

More specifically, let $(X, Y)$ be a sentence pair in the training data and $\mathcal{B}_k(X) = \{\hat{Y}_1, \ldots, \hat{Y}_k\}$ be the $k$-best list from beam search on the pretrained NMT model, where $k$ is the beam size. We write $s(\hat{Y}, Y)$ for the objective score of a candidate translation $\hat{Y}$ w.r.t. the gold-standard translation $Y$ according to a target metric such as BLEU (Papineni et al., 2002), NIST (Doddington, 2002), negative TER (Snover et al., 2006), or METEOR (Lavie and Denkowski, 2009). Then we choose the sentence that has the highest score to become our new target sentence:

$$Y^{*} = \arg\max_{\hat{Y} \in \mathcal{B}_k(X)} s(\hat{Y}, Y) \qquad (10)$$

Once we obtain the pseudo-corpus $D^{*} = \{(X^{(n)}, Y^{*(n)})\}_{n=1}^{N}$, we keep the underlying model parameters $\hat{\theta}$ fixed and train the actor parameters $\phi$ by maximizing the log-likelihood on these pairs:

$$\hat{\phi} = \arg\max_{\phi} \sum_{n=1}^{N} \log p(Y^{*(n)} \mid X^{(n)}; \hat{\theta}, \phi) \qquad (11)$$

In this way, the actor network is trained to manipulate the neural decoder’s hidden state at decoding time to induce it to produce better-scoring outputs under greedy or small-beam decoding.
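The corpus-construction step can be summarised in a short sketch. The `beam_search_kbest` hook and the use of NLTK's smoothed sentence-level BLEU are illustrative assumptions; any sentence-level scorer for the target metric can be substituted.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1

def sentence_score(hyp_tokens, ref_tokens):
    """Sentence-level BLEU as the target metric s(Y_hat, Y) in Equation 10."""
    return sentence_bleu([ref_tokens], hyp_tokens, smoothing_function=smooth)

def build_pseudo_corpus(parallel_corpus, beam_search_kbest, k=35):
    """Replace each gold target with the best-scoring hypothesis from the
    frozen base model's k-best list (the top1 strategy of Section 4.3)."""
    pseudo = []
    for src, ref in parallel_corpus:
        kbest = beam_search_kbest(src, k)   # k hypotheses from the frozen model
        best = max(kbest, key=lambda hyp: sentence_score(hyp, ref))
        pseudo.append((src, best))          # (X, Y*) pair for actor training
    return pseudo
```

The actor is then trained with ordinary teacher forcing on the resulting (X, Y*) pairs while the base model's parameters stay frozen, as in Equation 11.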

4 Experiments

IWSLT16        De-En BLEU            De-En tok/s           En-De BLEU            En-De tok/s
               greedy beam4  tg      greedy beam4  tg      greedy beam4  tg      greedy beam4  tg
RNN            23.57  24.90  23.59   62.8   45.0   60.4    20.05  21.11  19.88   48.1   32.5   45.7
ConvS2S        27.44  28.80  28.74   191.1  87.2   167.5   22.88  24.02  24.42   136.5  64.0   124.0
Transformer    27.15  28.74  28.36   63.9   31.0   59.8    23.87  25.03  25.46   57.9   26.5   51.2

WMT15          Fi-En BLEU            Fi-En tok/s           En-Fi BLEU            En-Fi tok/s
               greedy beam4  tg      greedy beam4  tg      greedy beam4  tg      greedy beam4  tg
RNN            12.45  13.22  13.02   51.5   33.1   43.4    9.77   10.81  10.57   44.0   31.2   43.8
ConvS2S        15.43  16.86  17.17   24.8   11.4   16.2    12.65  13.97  14.33   25.0   11.7   16.9
Transformer    13.76  14.61  14.49   31.4   13.4   29.8    12.38  13.55  12.95   29.8   12.8   27.9

WMT14          De-En BLEU            De-En tok/s           En-De BLEU            En-De tok/s
               greedy beam4  tg      greedy beam4  tg      greedy beam4  tg      greedy beam4  tg
RNN            23.08  24.62  24.54   38.4   26.6   36.4    18.87  20.59  19.89   33.2   22.4   32.5
ConvS2S        27.52  28.79  28.56   22.5   9.9    14.6    24.86  25.71  26.04   19.9   9.1    13.6
Transformer    26.44  27.31  26.96   32.9   14.3   30.9    22.01  22.74  22.31   28.5   12.2   26.1

Table 1: Generation quality (BLEU) and speed (tokens/sec). Speed is measured for sentence-by-sentence generation without mini-batching on the test set on CPU. We show the results of the underlying model with greedy decoding (greedy), beam search with $k=4$ (beam4), and our trainable greedy decoder (tg).

IWSLT16        De-En                 En-De
               tg      tg+beam4      tg      tg+beam4
RNN            23.59   25.03         19.88   20.72
ConvS2S        28.74   29.50         24.42   24.74
Transformer    28.36   28.95         25.46   25.89

WMT15          Fi-En                 En-Fi
               tg      tg+beam4      tg      tg+beam4
RNN            13.02   13.49         10.57   11.04
ConvS2S        17.17   17.51         14.33   14.87
Transformer    14.49   14.79         12.95   13.45

WMT14          De-En                 En-De
               tg      tg+beam4      tg      tg+beam4
RNN            24.54   24.86         19.89   20.56
ConvS2S        28.56   28.46         26.04   26.08
Transformer    26.96   27.21         22.31   21.92

Table 2: Generation quality (BLEU) using the proposed trainable greedy decoder without (tg) and with (tg+beam4) beam search ($k=4$). The results without beam search (tg) also appear in Table 1.

4.1 Setting

We evaluate our approach on IWSLT16 German-English, WMT15 Finnish-English, and WMT14 De-En translation in both directions with three strong translation model architectures.

For IWSLT16, we use tst2013 and tst2014 for validation and testing, respectively. For WMT15, we use newstest2013 and newstest2015 for validation and testing, respectively. For WMT14, we use newstest2013 and newstest2014 for validation and testing, respectively. All the data are tokenized and segmented into subword symbols using byte-pair encoding (BPE; Sennrich et al., 2016) to restrict the size of the vocabulary. Our primary evaluations use tokenized and cased BLEU. For METEOR and TER evaluations, we use multeval (https://github.com/jhclark/multeval) with tokenized and case-insensitive scoring. All the underlying models are trained from scratch, except for ConvS2S WMT14 English-German translation, for which we use the trained model (as well as training data) provided by Gehring et al. (2017) (https://s3.amazonaws.com/fairseq-py/models/wmt14.v2.en-de.fconv-py.tar.bz2).

RNN

We use OpenNMT-py (Klein et al., 2017; https://github.com/OpenNMT/OpenNMT-py) to implement our model. It is composed of an encoder with a two-layer bidirectional RNN and a decoder with another two-layer RNN. We choose hyperparameters similar to OpenNMT's default setting and the setting used by Artetxe et al. (2018), with one configuration for IWSLT16 and another for WMT. We use the input-feeding decoder and global attention with the general alignment function (Luong et al., 2015).

ConvS2S

We implement our model based on fairseq-py (https://github.com/facebookresearch/fairseq-py). We follow the settings in fconv_iwslt_de_en and fconv_wmt_en_de for IWSLT16 and WMT, respectively.

Transformer

We implement our model based on the code from Gu et al. (2018) (https://github.com/salesforce/nonauto-nmt) and follow their hyperparameter settings for all experiments.

In the results below, we focus on the gate actor and on pseudo-parallel corpora constructed by choosing the sentence with the best BLEU score from the $k$-best list produced by beam search with $k=35$. Experiments motivating these choices are shown later in this section.

4.2 Results and Analysis

src Am Vormittag wollte auch die Arbeitsgruppe Migration und Integration ihre Beratungen fortsetzen .
ref During the morning , the Migration and Integration working group also sought to continue its discussions .
greedy The morning also wanted to continue its discussions on migration and integration .
beam4 In the morning , the working group on migration and integration also wanted to continue its discussions .
beam35 In the morning , the migration and integration working group also wanted to continue its discussions .
tg The morning , the Migration and Integration Working Group wanted to continue its discussions .
tg+beam4 In the morning , the Migration and Integration Working Group wanted to continue its discussions .
src Die meisten Mails werden unterwegs mehrfach von Software-Robotern gelesen .
ref The majority of e-mails are read several times by software robots en route to the recipient .
greedy Most mails are read by software robots on the go .
beam4 Most mails are read by software robots on the go .
beam35 Most e-mails are read several times by software robots on the road .
tg Most mails are read several times by software robots on the road .
tg+beam4 Most mails are read several times by software robots on the road .
src Ich suche schon seit einiger Zeit eine neue Wohnung für meinen Mann und mich .
ref I have been looking for a new home for my husband and myself for some time now .
greedy I have been looking for a new apartment for some time for my husband and myself .
beam4 I have been looking for a new apartment for some time for my husband and myself .
beam35 I have been looking for a new apartment for my husband and myself for some time now .
tg I have been looking for a new apartment for my husband and myself for some time now .
tg+beam4 I have been looking for a new apartment for my husband and myself for some time now .
Table 3: Translation examples from the WMT14 De-En test set with Transformer. We show translations generated by the underlying Transformer using greedy decoding, beam search with $k=4$ (beam4), and beam search with $k=35$ rescored by the oracle BLEU scorer (beam35). We also show translations from our trainable greedy decoder, both without (tg) and with (tg+beam4) beam search. Phrases of interest are underlined.

                             IWSLT16 De-En               WMT14 De-En
                             ref    greedy  k35   tg     ref    greedy  k35   tg
Base Model                   20.4   65.3    61.5  64.2   23.5   65.2    63.8  65.1
+Trainable Greedy Decoder    19.1   70.4    65.3  75.1   18.9   76.0    72.6  82.8

Table 4: Word-level likelihood (%) averaged by sentence for the IWSLT16 and WMT14 De-En test sets with Transformer. Each row is the model used to evaluate word-level likelihood, and each column is a different source of translations: the reference (ref), greedy decoding with the base model (greedy), beam search with $k=35$ on the base model rescored by the BLEU scorer (k35), and the trainable greedy decoder (tg).

The results (Table 1) show that the use of the actor makes it practical to replace beam search with greedy decoding in most cases: We lose little or no performance, and doing so yields an increase in decoding efficiency, even accounting for the small overhead added by the actor. Among the three architectures, ConvS2S—the one with the most and largest layers—performs best. We conjecture that this gives the decoder more flexibility with which to guide decoding. In cases where model throughput is less important, our method can also be combined with beam search at test time to yield results somewhat better than either could achieve alone. Table 2 shows the result when combining our method with beam search.

Examples

Table 3 shows a few selected translations from the WMT14 German-English test set. In manual inspection of these examples and others, we find that the actor encourages models to recover missing tokens, optimize word order, and correct prepositions.

Likelihood

We also compare the word-level likelihood that the base model and the actor-augmented model assign to different decoding results. For a sentence pair $(X, Y)$, the word-level likelihood is defined as the average per-token probability

$$\ell(X, Y) = \frac{1}{T} \sum_{t=1}^{T} p(y_t \mid y_{<t}, X) \qquad (12)$$
Table 4 shows the word-level likelihood averaged over the test set for IWSLT16 and WMT14 German to English translation with Transformer. Our trainable greedy decoder learns a much more peaked distribution and assigns a much higher probability mass to its greedy decoding result than the base model. When evaluated under the base model, the translations from trainable greedy decoding have smaller likelihood than the translations from greedy decoding using the base model for both datasets. This indicates that the trainable greedy decoder is able to find a sequence that is not highly scored by the underlying model, but that corresponds to a high value of the target metric.
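As a small illustration, Equation 12 and the sentence-averaged numbers in Table 4 correspond to the following sketch, assuming access to per-token probabilities from a forced decode of each sentence.

```python
def word_level_likelihood(token_probs):
    """Average per-token probability p(y_t | y_<t, X) for one sentence (Eq. 12)."""
    return sum(token_probs) / len(token_probs)

def corpus_word_level_likelihood(all_token_probs):
    """Sentence-averaged word-level likelihood, as reported in Table 4."""
    per_sentence = [word_level_likelihood(p) for p in all_token_probs]
    return 100.0 * sum(per_sentence) / len(per_sentence)   # reported in percent
```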

Magnitude of Action Vector

We also record the norm of the action, decoder hidden state, and attentional source context vectors on the validation set. Figure 2 shows these values over the course of training on the IWSLT16 De-En validation set with Transformer. The norm of the action starts small, increases rapidly early in training, and converges to a value well below that of the decoder hidden state. This suggests that the action adjusts the decoder’s hidden state only slightly, rather than overwriting it.

Figure 2: The norms of the three activation vectors on the IWSLT16 De-En validation set with Transformer. Action, Context and State represent the norm of the action, attentional source context vector and decoder hidden state, respectively.

4.3 Effects of Model Settings

Actor Architecture

Figure 3 shows the trainable greedy decoding result on the IWSLT16 De-En validation set with different actor architectures. We observe that our approach is stable across different actor architectures and is relatively insensitive to the hyperparameters of the actor. For the same type of actor, performance increases gradually with the hidden layer size. The use of a recurrent connection within the actor does not meaningfully improve performance, possibly since all actors can use the recurrent connections of the underlying decoder. Since the gate actor contains no additional hyperparameters and was observed to learn quickly and reliably, we use it in all other experiments.

Here, we also explore a simple alternative to the use of the actor: creating a pseudo-parallel corpus with each model, and then training each model, unmodified and in its entirety, directly on this new corpus. This experiment (cont. in Figure 3) yields results that are comparable to, but not better than, the results seen with the actors. However, this comes with substantially greater computational cost at training time, and, if the same trained model is to be optimized for multiple target metrics, greater storage costs as well.

Figure 3: The effect of the actor architecture and hidden state size on trainable greedy decoding over the IWSLT16 De-En validation set with Transformer (BLEU), shown with a baseline (cont.) in which the underlying model, rather than the actor, is trained on the pseudo-parallel corpus. The Y-axis shows BLEU improvement and starts from 1.0; 0.0 corresponds to 33.04 BLEU. w.o. indicates an actor with no hidden layer.

Beam Size

Figure 4a shows the effect of the beam size used to generate the pseudo-parallel corpus on the IWSLT16 De-En validation set with Transformer. Trainable greedy decoding improves over greedy decoding even when we set $k=1$, namely running greedy decoding on the unaugmented model to construct the new training corpus. As the beam size $k$ increases, the BLEU score consistently increases, but we observe diminishing returns beyond roughly $k=35$, and we use that value elsewhere.

Training Corpus Construction

There are a variety of ways one might use the output of beam search to construct a pseudo-parallel corpus: we could use the single highest-scoring output (by BLEU, or our target metric) for each input (top1), use all beam search outputs (full), use only those outputs that score higher than a threshold, namely the score of the base model's greedy decoding output (thd), or combine the top1 results with the gold-standard translations (comb.). We show the effect of training corpus construction in Figure 4b, where para denotes the baseline approach of training the actor with the original parallel corpus used to train the underlying NMT model. Among the four new strategies, full performs worst, since the beam search outputs contain translations that are far from the gold-standard translation. We therefore use the best-performing top1 strategy, as sketched below.
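The four strategies can be viewed as different filters over the same scored $k$-best lists, as in the sketch below; the helper name and its arguments are hypothetical.

```python
def select_targets(kbest_scored, greedy_score, gold, strategy="top1"):
    """Choose training targets for one source sentence from its scored k-best list.

    kbest_scored: list of (hypothesis, metric_score) pairs from beam search
    greedy_score: metric score of the base model's greedy output (the thd threshold)
    gold:         the original gold-standard target
    """
    best = max(kbest_scored, key=lambda hs: hs[1])[0]
    if strategy == "top1":                       # single best hypothesis
        return [best]
    if strategy == "full":                       # every beam search output
        return [h for h, _ in kbest_scored]
    if strategy == "thd":                        # only outputs beating greedy decoding
        return [h for h, s in kbest_scored if s > greedy_score]
    if strategy == "comb":                       # top1 plus the gold target
        return [best, gold]
    raise ValueError(f"unknown strategy: {strategy}")
```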

Decoding Objectives

As our approach can use an arbitrary decoding objective, we investigate the effect of different objectives on BLEU, METEOR (MTR) and TER scores with Transformer for IWSLT16 De-En translation. Table 5 shows the final results on the test set. When trained with one objective, our model yields relatively good performance on that objective. For example, negative sentence-level TER (-sTER) reduces TER by 3.0 points relative to greedy decoding and by 0.5 points relative to beam search. However, since these objectives are all well correlated with each other, training with different objectives does not change the results dramatically.

5 Related Work

Data Distillation

Our work is directly inspired by work on knowledge distillation, which uses a similar pseudo-parallel corpus strategy but aims at training a compact model to approximate the function learned by a larger model or an ensemble of models (Hinton et al., 2015). Kim and Rush (2016) introduce knowledge distillation in the context of NMT and show that a smaller student network can be trained to achieve performance similar to a teacher model by learning from a pseudo-corpus generated by the teacher. Zhang et al. (2017) propose a new strategy for generating the pseudo-corpus, namely fast sequence interpolation based on the greedy output of the teacher model and the parallel corpus. Freitag et al. (2017) extend knowledge distillation to ensemble and oracle-BLEU teacher models. However, all these approaches require the expensive procedure of retraining the full student network.

Figure 4: (a) The effect of beam size on the IWSLT16 De-En validation set with Transformer, and (b) the effect of the training corpus composition in the same setting. para: parallel corpus; full: all beam search outputs; thd: beam search outputs that score higher than the base model's greedy decoding output; top1: beam search output with the highest BLEU score; comb.: top1 + para. 0.0 corresponds to 33.04 BLEU.

               Obj.    BLEU   MTR   TER
Greedy         -       27.15  29.0  54.4
Beam4          -       28.74  29.9  51.9
tg             sBLEU   28.36  29.7  52.0
tg             sMTR    28.36  29.6  51.8
tg             -sTER   28.05  29.6  51.4

Table 5: Results when trained with different decoding objectives on IWSLT16 De-En translation using Transformer. MTR denotes METEOR. The upper half reports greedy decoding and beam search ($k=4$) results using the original model; the lower half reports results with trainable greedy decoding (tg).

Pseudo-Parallel Corpora in Statistical MT

Pseudo-parallel corpora generated from beam search have previously been used in statistical machine translation (SMT) (Chiang, 2012; Gao and He, 2013; Auli and Gao, 2014; Dakwale and Monz, 2016). Gao and He (2013) integrate a recurrent neural network language model as an additional feature into a trained phrase-based SMT system and train it by maximizing the expected BLEU over the $k$-best list from the underlying model. Our work revisits a similar idea in the context of trainable greedy decoding for neural MT.

Decoding for Multiple Objectives

Several works have proposed to incorporate different decoding objectives into training. Ranzato et al. (2015) and Bahdanau et al. (2016) use reinforcement learning to achieve this goal. Shen et al. (2016) and Norouzi et al. (2016) train the model by defining an objective-dependent loss function. Wiseman and Rush (2016) propose a learning algorithm tailored for beam search. Unlike these works, which optimize the entire model, Li et al. (2017) introduce an additional network that predicts an arbitrary decoding objective given a source sentence and a prefix of the translation; this prediction is used as an auxiliary score in beam search. All of these methods focus primarily on improving beam search results rather than greedy decoding.

6 Conclusion

This paper introduces a novel method, based on an automatically-generated pseudo-parallel corpus, for training an actor-augmented decoder to optimize for greedy decoding. Experiments on three models and three datasets show that the training strategy makes it possible to substantially improve the performance of an arbitrary neural sequence decoder on any reasonable translation metric in either greedy or beam-search decoding, all with only a few trained parameters and minimal additional training time.

As our model is agnostic to both the model architecture and the target metric, we see the exploration of more diverse and ambitious model–target metric pairs as a clear avenue for future work.

Acknowledgments

This work was partly supported by Samsung Advanced Institute of Technology (Next Generation Deep Learning: from pattern recognition to AI), Samsung Electronics (Improving Deep Learning using Latent Structure) and the Facebook Low Resource Neural Machine Translation Award. KC thanks support by eBay, TenCent, NVIDIA and CIFAR. This project has also benefited from financial support to SB by Google and Tencent Holdings.

References

  • Artetxe et al. (2018) Mikel Artetxe, Gorka Labaka, Eneko Agirre, and Kyunghyun Cho. 2018. Unsupervised neural machine translation. In International Conference on Learning Representations.
  • Auli and Gao (2014) Michael Auli and Jianfeng Gao. 2014. Decoder integration and expected bleu training for recurrent neural network language models. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 136–142, Baltimore, Maryland. Association for Computational Linguistics.
  • Bahdanau et al. (2016) Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Joseph Lowe, Joelle Pineau, Aaron C. Courville, and Yoshua Bengio. 2016. An actor-critic algorithm for sequence prediction. preprint arXiv:1607.07086.
  • Bahdanau et al. (2015) Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of International Conference on Learning Representations (ICLR).
  • Boulanger-Lewandowski et al. (2013) Nicolas Boulanger-Lewandowski, Yoshua Bengio, and Pascal Vincent. 2013. Audio chord recognition with recurrent neural networks. In Proceedings of the 14th International Society for Music Information Retrieval Conference.
  • Britz et al. (2017) Denny Britz, Anna Goldie, Minh-Thang Luong, and Quoc Le. 2017. Massive exploration of neural machine translation architectures. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1442–1451, Copenhagen, Denmark. Association for Computational Linguistics.
  • Chen et al. (2017) Yun Chen, Yang Liu, Yong Cheng, and Victor O.K. Li. 2017. A teacher-student framework for zero-resource neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1925–1935. Association for Computational Linguistics.
  • Chiang (2012) David Chiang. 2012. Hope and fear for discriminative training of statistical translation models. Journal of Machine Learning Research, 13(Apr):1159–1187.
  • Cho (2016) Kyunghyun Cho. 2016. Noisy parallel approximate decoding for conditional recurrent language model. preprint arXiv:1605.03835.
  • Cho et al. (2014) Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using rnn encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734, Doha, Qatar. Association for Computational Linguistics.
  • Dakwale and Monz (2016) Praveen Dakwale and Christof Monz. 2016. Improving statistical machine translation performance by oracle-bleu model re-estimation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), volume 2, pages 38–44.
  • Dehghani et al. (2018) Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, and Łukasz Kaiser. 2018. Universal transformers. preprint arXiv:1807.03819.
  • Doddington (2002) George Doddington. 2002. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the second international conference on Human Language Technology Research, pages 138–145. Morgan Kaufmann Publishers Inc.
  • Freitag et al. (2017) Markus Freitag, Yaser Al-Onaizan, and Baskaran Sankaran. 2017. Ensemble distillation for neural machine translation. preprint arXiv:1702.01802.
  • Gao and He (2013) Jianfeng Gao and Xiaodong He. 2013. Training mrf-based phrase translation models using gradient ascent. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 450–459, Atlanta, Georgia. Association for Computational Linguistics.
  • Gehring et al. (2017) Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann Dauphin. 2017. Convolutional sequence to sequence learning. In International Conference on Machine Learning.
  • Graves (2012) Alex Graves. 2012. Sequence transduction with recurrent neural networks. preprint arXiv:1211.3711.
  • Gu et al. (2018) Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, and Richard Socher. 2018. Non-autoregressive neural machine translation. In Proceedings of International Conference on Learning Representations (ICLR).
  • Gu et al. (2017) Jiatao Gu, Kyunghyun Cho, and Victor O.K. Li. 2017. Trainable greedy decoding for neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1968–1978, Copenhagen, Denmark. Association for Computational Linguistics.
  • Hinton et al. (2015) Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. preprint arXiv:1503.02531.
  • Huang et al. (2017) Liang Huang, Kai Zhao, and Mingbo Ma. 2017. When to finish? optimal beam search for neural text generation (modulo beam size). In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2134–2139.
  • Kim and Rush (2016) Yoon Kim and Alexander M. Rush. 2016. Sequence-level knowledge distillation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1317–1327, Austin, Texas. Association for Computational Linguistics.
  • Klein et al. (2017) Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, and Alexander Rush. 2017. Opennmt: Open-source toolkit for neural machine translation. In Proceedings of ACL 2017, System Demonstrations, pages 67–72, Vancouver, Canada. Association for Computational Linguistics.
  • Koehn and Knowles (2017) Philipp Koehn and Rebecca Knowles. 2017. Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation, pages 28–39, Vancouver. Association for Computational Linguistics.
  • Lavie and Denkowski (2009) Alon Lavie and Michael J Denkowski. 2009. The meteor metric for automatic evaluation of machine translation. Machine translation, 23(2-3):105–115.
  • Li et al. (2016) Jiwei Li, Will Monroe, and Dan Jurafsky. 2016. A simple, fast diverse decoding algorithm for neural generation. preprint arXiv:1611.08562.
  • Li et al. (2017) Jiwei Li, Will Monroe, and Daniel Jurafsky. 2017. Learning to decode for future success. preprint arXiv:1701.06549.
  • Luong et al. (2015) Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1412–1421, Lisbon, Portugal. Association for Computational Linguistics.
  • Norouzi et al. (2016) Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, and Dale Schuurmans. 2016. Reward augmented maximum likelihood for neural structured prediction. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 1723–1731. Curran Associates, Inc.
  • Papineni et al. (2002) Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
  • Paulus et al. (2017) Romain Paulus, Caiming Xiong, and Richard Socher. 2017. A deep reinforced model for abstractive summarization. preprint arXiv:1705.04304.
  • Ranzato et al. (2015) Marc’Aurelio Ranzato, Sumit Chopra, Michael Auli, and Wojciech Zaremba. 2015. Sequence level training with recurrent neural networks. preprint arXiv:1511.06732.
  • Rush et al. (2015) Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. A neural attention model for abstractive sentence summarization. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 379–389, Lisbon, Portugal. Association for Computational Linguistics.
  • See et al. (2017) Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1073–1083, Vancouver, Canada. Association for Computational Linguistics.
  • Sennrich et al. (2016) Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715–1725, Berlin, Germany. Association for Computational Linguistics.
  • Shen et al. (2016) Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum risk training for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1683–1692, Berlin, Germany. Association for Computational Linguistics.
  • Shu and Nakayama (2017) Raphael Shu and Hideki Nakayama. 2017. Later-stage minimum bayes-risk decoding for neural machine translation. preprint arXiv:1704.03169.
  • Snover et al. (2006) Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of association for machine translation in the Americas, volume 200.
  • Tu et al. (2017) Zhaopeng Tu, Yang Liu, Lifeng Shang, Xiaohua Liu, and Hang Li. 2017. Neural machine translation with reconstruction.
  • Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30.
  • Vinyals et al. (2015) Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on, pages 3156–3164. IEEE.
  • Wiseman and Rush (2016) Sam Wiseman and Alexander M. Rush. 2016. Sequence-to-sequence learning as beam-search optimization. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1296–1306, Austin, Texas. Association for Computational Linguistics.
  • Xu et al. (2015) Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning, pages 2048–2057.
  • Zhang et al. (2017) Xiaowei Zhang, Wei Chen, Feng Wang, Shuang Xu, and Bo Xu. 2017. Towards compact and fast neural machine translation using a combined method. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1475–1481, Copenhagen, Denmark. Association for Computational Linguistics.