On Using Very Large Target Vocabulary for Neural Machine Translation

12/05/2014
by   Sébastien Jean, et al.
0

Neural machine translation, a recently proposed approach to machine translation based purely on neural networks, has shown promising results compared to the existing approaches such as phrase-based statistical machine translation. Despite its recent success, neural machine translation has its limitation in handling a larger vocabulary, as training complexity as well as decoding complexity increase proportionally to the number of target words. In this paper, we propose a method that allows us to use a very large target vocabulary without increasing training complexity, based on importance sampling. We show that decoding can be efficiently done even with the model having a very large target vocabulary by selecting only a small subset of the whole target vocabulary. The models trained by the proposed approach are empirically found to outperform the baseline models with a small vocabulary as well as the LSTM-based neural machine translation models. Furthermore, when we use the ensemble of a few models with very large target vocabularies, we achieve the state-of-the-art translation performance (measured by BLEU) on the English->German translation and almost as high performance as state-of-the-art English->French translation system.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/10/2016

Vocabulary Manipulation for Neural Machine Translation

In order to capture rich language phenomena, neural machine translation ...
research
09/09/2018

Speeding Up Neural Machine Translation Decoding by Cube Pruning

Although neural machine translation has achieved promising results, it s...
research
10/01/2016

Vocabulary Selection Strategies for Neural Machine Translation

Classical translation models constrain the space of possible outputs by ...
research
11/07/2016

A Convolutional Encoder Model for Neural Machine Translation

The prevalent approach to neural machine translation relies on bi-direct...
research
04/23/2017

Neural Machine Translation via Binary Code Prediction

In this paper, we propose a new method for calculating the output layer ...
research
05/13/2022

The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation

Vocabulary selection, or lexical shortlisting, is a well-known technique...
research
01/06/2019

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes

We address for the first time unsupervised training for a translation ta...

Please sign up or login with your details

Forgot password? Click here to reset