Massive Exploration of Neural Machine Translation Architectures

03/11/2017
by Denny Britz, et al.

Neural Machine Translation (NMT) has shown remarkable progress over the past few years, with production systems now being deployed to end users. One major drawback of current architectures is that they are expensive to train, typically requiring days to weeks of GPU time to converge. This makes exhaustive hyperparameter search, as is commonly done with other neural network architectures, prohibitively expensive. In this work, we present the first large-scale analysis of NMT architecture hyperparameters. We report empirical results and variance numbers for several hundred experimental runs, corresponding to over 250,000 GPU hours on the standard WMT English-to-German translation task. Our experiments lead to novel insights and practical advice for building and extending NMT architectures. As part of this contribution, we release an open-source NMT framework that enables researchers to easily experiment with novel techniques and reproduce state-of-the-art results.

