An Efficient Character-Level Neural Machine Translation

by Shenjian Zhao, et al.

Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves translation performance comparable to existing state-of-the-art phrase-based systems on English-to-French translation. However, the use of a large vocabulary becomes the bottleneck in both training and improving performance. In this paper, we propose an efficient architecture for training a deep character-level neural machine translation model by introducing a decimator and an interpolator. The decimator samples the source sequence before encoding, while the interpolator resamples after decoding. Such a deep model has two major advantages: it avoids the large-vocabulary issue entirely, and it is much faster and more memory-efficient to train than conventional character-based models. More interestingly, our model is able to translate misspelled words, much as human beings can.
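The abstract does not say how the decimator and interpolator are built, so the following is only a toy sketch of the general idea of shortening a character sequence before encoding and restoring its length after decoding. Average pooling and repetition are stand-ins chosen for illustration, and the names `decimate`, `interpolate`, `embed`, and the `factor` parameter are hypothetical rather than taken from the paper.

```python
from typing import List

Vector = List[float]


def decimate(seq: List[Vector], factor: int) -> List[Vector]:
    """Shorten a sequence by averaging each run of `factor` consecutive vectors.

    Assumption: average pooling stands in for the paper's decimator.
    """
    out: List[Vector] = []
    for start in range(0, len(seq), factor):
        window = seq[start:start + factor]  # the last window may be shorter
        dim = len(window[0])
        out.append([sum(v[d] for v in window) / len(window) for d in range(dim)])
    return out


def interpolate(seq: List[Vector], factor: int, length: int) -> List[Vector]:
    """Lengthen a sequence by repeating each vector `factor` times, trimmed to `length`.

    Assumption: plain repetition stands in for the paper's interpolator.
    """
    out = [list(v) for v in seq for _ in range(factor)]
    return out[:length]


def embed(text: str) -> List[Vector]:
    """Toy one-dimensional 'embedding': the code point of each character."""
    return [[float(ord(c))] for c in text]


chars = embed("translation")                  # one vector per character
short = decimate(chars, 4)                    # fewer positions for the encoder to process
restored = interpolate(short, 4, len(chars))  # back to character resolution

print(len(chars), len(short), len(restored))
```

In this sketch the encoder would see roughly `len(chars) / factor` positions instead of one per character, which is the kind of saving that makes character-level training cheaper. A learned model would replace both the pooling and the repetition with trainable layers.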



