Modeling Latent Sentence Structure in Neural Machine Translation

by Joost Bastings et al.
University of Amsterdam

Recently it was shown that linguistic structure predicted by a supervised parser can be beneficial for neural machine translation (NMT). In this work we investigate a more challenging setup: we incorporate sentence structure as a latent variable in a standard NMT encoder-decoder and induce it in such a way as to benefit the translation task. We consider German-English and Japanese-English translation benchmarks and observe that when using RNN encoders the model makes no or very limited use of the structure induction apparatus. In contrast, CNN and word-embedding-based encoders rely on latent graphs and force them to encode useful, potentially long-distance, dependencies.
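As a rough illustration only (not the authors' implementation, and with hypothetical parameter matrices `Wq`, `Wk`, `Wg`), latent graph induction over a word-embedding encoder can be sketched as follows: pairwise compatibility scores between word states are normalized into a soft adjacency matrix, and each word's representation is then enriched with a weighted sum of its latent neighbors, which is how potentially long-distance dependencies can be encoded.

```python
import numpy as np

def induce_latent_graph(H, Wq, Wk):
    """Induce a soft latent graph over word states H (n x d)."""
    # Pairwise compatibility scores between word states (n x n)
    scores = (H @ Wq) @ (H @ Wk).T
    # Row-wise softmax: each row is a distribution over latent neighbors
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return A / A.sum(axis=-1, keepdims=True)

def graph_enrich(H, A, Wg):
    """One graph-convolution-like step: aggregate latent neighbors."""
    # Weighted sum over neighbors, transformed, with a residual connection
    return np.tanh(A @ H @ Wg + H)

rng = np.random.default_rng(0)
n, d = 5, 8                       # 5 words, embedding size 8
H = rng.normal(size=(n, d))       # word-embedding encoder states
Wq, Wk, Wg = (rng.normal(size=(d, d)) for _ in range(3))

A = induce_latent_graph(H, Wq, Wk)
H_enriched = graph_enrich(H, A, Wg)
```

Because the adjacency matrix is produced by differentiable operations only, it can be trained end-to-end with the translation objective, so the induced structure is shaped purely by what helps translation rather than by a supervised parser.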


