Fast Neural Machine Translation Implementation

05/24/2018
by Hieu Hoang, et al.

This paper describes the submissions to the efficiency track for GPUs by members of the University of Edinburgh, Adam Mickiewicz University, Tilde and the University of Alicante. We focus on an efficient implementation of the recurrent deep-learning model as implemented in Amun, a fast inference engine for neural machine translation. We improve performance with an efficient mini-batching algorithm and by fusing the softmax operation with the k-best extraction algorithm.
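
To illustrate why the softmax and the k-best extraction can be combined at all, here is a minimal sketch in plain Python. It is not Amun's GPU code; the function names `fused_softmax_kbest` and `naive_softmax_kbest` are hypothetical and chosen only for this example. The sketch relies on the fact that softmax is monotonic: the k largest logits are also the k largest probabilities, so only those k entries need to be normalized and the full probability vector never has to be stored.

```python
import heapq
import math


def fused_softmax_kbest(logits, k):
    """Return the k most probable (index, probability) pairs.

    Illustrative assumption: the full softmax vector is never built.
    Only the normalizer and the k selected entries are computed.
    """
    max_logit = max(logits)  # subtracted for numerical stability
    normalizer = sum(math.exp(x - max_logit) for x in logits)
    # Softmax preserves ordering, so selecting on raw logits is enough.
    top = heapq.nlargest(k, enumerate(logits), key=lambda pair: pair[1])
    return [(i, math.exp(x - max_logit) / normalizer) for i, x in top]


def naive_softmax_kbest(logits, k):
    """Reference version: materialize every probability, then select k."""
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return heapq.nlargest(k, enumerate(probs), key=lambda pair: pair[1])


if __name__ == "__main__":
    scores = [2.0, -1.0, 0.5, 3.0]
    print(fused_softmax_kbest(scores, 2))
    print(naive_softmax_kbest(scores, 2))
```

Both functions select the same indices with matching probabilities; the fused variant simply skips the intermediate vector, which is the kind of saving that matters when the output vocabulary is large.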
