Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU

05/04/2017
by Jacob Devlin, et al.

Attentional sequence-to-sequence models have become the new standard for machine translation, but one challenge of such models is a significant increase in training and decoding cost compared to phrase-based systems. Here, we focus on efficient decoding, with the goal of achieving accuracy close to the state of the art in neural machine translation (NMT) while achieving CPU decoding speed/throughput close to that of a phrasal decoder. We approach this problem from two angles. First, we describe several techniques for speeding up an NMT beam search decoder, which obtain a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. Second, we propose a simple but powerful network architecture which uses an RNN (GRU/LSTM) layer at the bottom, followed by a series of stacked fully-connected layers applied at every timestep. This architecture achieves similar accuracy to a deep recurrent model, at a small fraction of the training and decoding cost. By combining these techniques, our best system achieves a very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014, while decoding at 100 words/sec on a single-threaded CPU. We believe this is the best published accuracy/speed trade-off of an NMT system.
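
The architecture described in the abstract (one recurrent layer at the bottom, then stacked fully-connected layers applied at every timestep) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, layer sizes, random initialization, ReLU activation, and the omission of biases and attention are all assumptions made for illustration. The point it shows is that only the bottom layer carries state across timesteps, so the layers above it add depth without adding recurrent computation.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; the initialization scheme is an assumption."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(params, x, h):
    """One standard GRU update (biases omitted for brevity)."""
    xh = x + h  # list concatenation: [x; h]
    z = [sigmoid(v) for v in matvec(params["Wz"], xh)]  # update gate
    r = [sigmoid(v) for v in matvec(params["Wr"], xh)]  # reset gate
    x_rh = x + [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in matvec(params["Wh"], x_rh)]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def fc_layer(W, x):
    """Fully-connected layer with ReLU (the activation is an assumption)."""
    return [max(0.0, v) for v in matvec(W, x)]


def build(input_dim, hidden, n_fc, seed=0):
    """Create hypothetical parameters: one GRU layer plus n_fc square FC layers."""
    rng = random.Random(seed)
    gru = {k: rand_matrix(hidden, input_dim + hidden, rng) for k in ("Wz", "Wr", "Wh")}
    fcs = [rand_matrix(hidden, hidden, rng) for _ in range(n_fc)]
    return gru, fcs


def forward(inputs, gru_params, fc_weights, hidden):
    """Run the recurrent layer, then the FC stack, at every timestep."""
    h = [0.0] * hidden
    outputs = []
    for x in inputs:
        h = gru_step(gru_params, x, h)  # the only state carried across time
        y = h
        for W in fc_weights:  # depth without recurrence
            y = fc_layer(W, y)
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    gru, fcs = build(input_dim=3, hidden=4, n_fc=2)
    seq = [[0.1 * t, -0.2, 0.3] for t in range(5)]
    for t, y in enumerate(forward(seq, gru, fcs, hidden=4)):
        print(t, [round(v, 4) for v in y])
```

Because the fully-connected layers at a given timestep depend only on that timestep's recurrent output, they introduce no additional sequential dependency across time, which is one plausible reason such a stack is cheaper than a deep recurrent model of the same depth.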

research
11/01/2017

Improving Neural Machine Translation through Phrase-based Forced Decoding

Compared to traditional statistical machine translation (SMT), neural ma...
research
08/25/2018

Exploring Recombination for Efficient Decoding of Neural Machine Translation

In Neural Machine Translation (NMT), the decoder can capture the feature...
research
05/15/2016

Syntactically Guided Neural Machine Translation

We investigate the use of hierarchical phrase-based SMT lattices in end-...
research
09/26/2016

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Neural Machine Translation (NMT) is an end-to-end learning approach for ...
research
06/07/2016

Memory-enhanced Decoder for Neural Machine Translation

We propose to enhance the RNN decoder in a neural machine translator (NM...
research
02/02/2019

An end-to-end Generative Retrieval Method for Sponsored Search Engine --Decoding Efficiently into a Closed Target Domain

In this paper, we present a generative retrieval method for sponsored se...
research
11/07/2016

A Convolutional Encoder Model for Neural Machine Translation

The prevalent approach to neural machine translation relies on bi-direct...
