Beyond Weight Tying: Learning Joint Input-Output Embeddings for Neural Machine Translation

08/31/2018
by   Nikolaos Pappas, et al.
Tying the weights of the target word embeddings with the target word classifiers of neural machine translation models leads to faster training and often to better translation quality. Given the success of this parameter sharing, we investigate other forms of sharing that lie between no sharing and hard equality of parameters. In particular, we propose a structure-aware output layer that captures the semantic structure of the output space of words within a joint input-output embedding. The model is a generalized form of weight tying: it shares parameters but can learn a more flexible relationship with the input word embeddings, and it allows the effective capacity of the output layer to be controlled. In addition, the model shares weights across output classifiers and translation contexts, which allows it to better leverage prior knowledge about them. Our evaluation on English-to-Finnish and English-to-German datasets shows the effectiveness of the method against strong encoder-decoder baselines trained with or without weight tying.
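
To make the contrast concrete, here is a minimal sketch of plain weight tying next to one possible joint input-output embedding layer. The abstract gives no equations, so the specific form below is an assumption rather than the authors' formulation: the projection matrices U and V, the joint dimension d_joint, the tanh nonlinearity, and the omission of bias terms are all illustrative choices. In this sketch, d_joint is the knob that controls the capacity of the output layer.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [x / total for x in exps]


def rand_matrix(rows, cols, rng):
    """Small random matrix, used here only as stand-in parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def tied_logits(E, h):
    """Weight tying: the output classifier matrix is the embedding matrix E."""
    return matvec(E, h)


def joint_logits(E, h, U, V):
    """Assumed joint-space variant: project the word embeddings (via U) and
    the decoder context (via V) into a shared d_joint-dimensional space,
    then score each word by a dot product in that space."""
    h_joint = [math.tanh(x) for x in matvec(V, h)]
    E_joint = [[math.tanh(x) for x in matvec(U, e)] for e in E]
    return matvec(E_joint, h_joint)


rng = random.Random(0)
vocab, d, d_joint = 6, 4, 8            # hypothetical toy sizes

E = rand_matrix(vocab, d, rng)          # target word embeddings (shared)
h = [rng.uniform(-1, 1) for _ in range(d)]  # decoder context vector
U = rand_matrix(d_joint, d, rng)        # embedding -> joint space (assumed)
V = rand_matrix(d_joint, d, rng)        # context -> joint space (assumed)

p_tied = softmax(tied_logits(E, h))
p_joint = softmax(joint_logits(E, h, U, V))

# Extra output-layer parameters beyond the shared embeddings:
extra_tied = 0
extra_joint = 2 * d_joint * d           # independent of vocabulary size
extra_untied = vocab * d                # a separate classifier matrix
```

In both variants the embedding matrix E is reused at the output, so neither needs a separate vocabulary-sized classifier. The joint variant adds only the two projections, whose size depends on d_joint and not on the vocabulary.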
