Context-Dependent Word Representation for Neural Machine Translation

07/03/2016
by Heeyoul Choi, et al.

We first observe a potential weakness of continuous vector representations of symbols in neural machine translation: the continuous vector representation, or word embedding, of a symbol encodes multiple dimensions of similarity, which is equivalent to encoding more than one meaning of the word. As a consequence, the encoder and decoder recurrent networks in neural machine translation must spend a substantial amount of their capacity disambiguating source and target words based on the context defined by the source sentence. Based on this observation, we propose to contextualize the word embedding vectors using a nonlinear bag-of-words representation of the source sentence. Additionally, we propose to represent special tokens (such as numbers, proper nouns, and acronyms) with typed symbols, to facilitate translating those words that are not well suited to translation via continuous vectors. Experiments on En-Fr and En-De reveal that the proposed contextualization and symbolization significantly improve the translation quality of neural machine translation systems.
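To make the two proposals concrete, here is a minimal NumPy sketch of the contextualization idea: a nonlinear transform of a bag-of-words summary of the source sentence produces a per-dimension gate that masks each word embedding, so a single embedding table can yield sentence-specific, disambiguated word vectors. The details below are assumptions not spelled out in the abstract (mean pooling for the bag of words, a single sigmoid layer as the nonlinearity, elementwise gating), and the names `E`, `W`, and `contextualize` are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 1000, 64

# Hypothetical parameters: an embedding table and one feedforward layer
# that maps the sentence summary to a per-dimension gate.
E = rng.normal(scale=0.1, size=(VOCAB, DIM))    # word embedding table
W = rng.normal(scale=0.1, size=(DIM, DIM))      # contextualizer weights
b = np.zeros(DIM)                               # contextualizer bias

def contextualize(token_ids):
    """Gate each word embedding with a nonlinear bag-of-words summary of
    the whole source sentence, yielding context-dependent word vectors."""
    emb = E[token_ids]                          # (T, DIM) raw embeddings
    bow = emb.mean(axis=0)                      # order-free sentence summary
    gate = 1.0 / (1.0 + np.exp(-(bow @ W + b))) # sigmoid, values in (0, 1)
    return emb * gate                           # per-dimension masking

sentence = rng.integers(0, VOCAB, size=7)       # a toy source sentence
print(contextualize(sentence).shape)            # (7, 64)
```

Symbolization, in turn, is a preprocessing step: tokens that continuous vectors handle poorly are replaced by typed placeholders before training, then copied or restored after translation. The tag names and regular-expression heuristics in this sketch are assumptions for illustration only.

```python
import re

def symbolize(tokens):
    """Replace numbers and acronyms with typed placeholder symbols
    (hypothetical tags; proper nouns would need an NER step)."""
    out = []
    for tok in tokens:
        if re.fullmatch(r"\d[\d,.]*", tok):
            out.append("<NUM>")                 # typed symbol for numbers
        elif len(tok) > 1 and tok.isupper():
            out.append("<ACR>")                 # typed symbol for acronyms
        else:
            out.append(tok)
    return out

print(symbolize("IBM sold 25,000 units in 2016 .".split()))
# ['<ACR>', 'sold', '<NUM>', 'units', 'in', '<NUM>', '.']
```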
