Training Neural Machine Translation using Word Embedding-based Loss

07/30/2018
by Katsuki Chousa, et al.

In neural machine translation (NMT), the computational cost of the output layer increases with the size of the target-side vocabulary. Using a limited-size vocabulary instead may cause a significant drop in translation quality. This trade-off stems from the softmax-based loss function, which treats in-dictionary words independently and does not consider word similarity. In this paper, we propose a novel NMT loss function that incorporates word similarity in the form of distances in a word embedding space. The proposed loss encourages the NMT decoder to generate words close to their references in the embedding space; this helps the decoder choose acceptable similar words when the actual best candidates are excluded from the vocabulary by its size limit. In experiments on the ASPEC Japanese-to-English and IWSLT17 English-to-French datasets, the proposed method improved over a standard NMT baseline on both; on IWSLT17 En-Fr in particular, it achieved gains of up to +1.72 BLEU and +1.99 METEOR. When the target-side vocabulary was limited to only 1,000 words, the proposed method showed a substantial gain of +1.72 METEOR on ASPEC Ja-En.
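The general idea of augmenting the training objective with an embedding-space distance can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the interpolation weight alpha, the cosine-distance penalty, and the use of the expected output embedding under the softmax distribution are assumptions made for the sketch.

```python
# Illustrative sketch of a word-embedding-based loss for NMT training.
# Not the authors' exact method; `alpha`, the cosine distance, and the
# expected-embedding construction are assumptions for illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F


class EmbeddingDistanceLoss(nn.Module):
    """Cross-entropy plus a penalty on the embedding-space distance between
    the decoder's expected output embedding and the reference embedding."""

    def __init__(self, target_embeddings: torch.Tensor, alpha: float = 0.5, pad_id: int = 0):
        super().__init__()
        # target_embeddings: (vocab_size, embed_dim), e.g. pretrained word vectors
        self.register_buffer("emb", target_embeddings)
        self.alpha = alpha      # weight of the embedding-distance term (assumed)
        self.pad_id = pad_id

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # logits: (batch * seq_len, vocab_size), targets: (batch * seq_len,)
        ce = F.cross_entropy(logits, targets, ignore_index=self.pad_id)

        probs = F.softmax(logits, dim=-1)     # predicted distribution over the vocabulary
        expected_emb = probs @ self.emb       # expected output embedding under that distribution
        ref_emb = self.emb[targets]           # embedding of the reference word

        # Cosine distance between expected and reference embeddings,
        # averaged over non-padding positions.
        mask = (targets != self.pad_id).float()
        cos = F.cosine_similarity(expected_emb, ref_emb, dim=-1)
        dist = ((1.0 - cos) * mask).sum() / mask.sum().clamp(min=1.0)

        return ce + self.alpha * dist
```

At training time such a module would simply replace the plain cross-entropy criterion, e.g. `loss = criterion(logits.view(-1, vocab_size), targets.view(-1))`; pretrained target-side word vectors are a natural choice for `target_embeddings`.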

Related research

05/25/2018  Japanese Predicate Conjugation for Neural Machine Translation
  Neural machine translation (NMT) has a drawback in that can generate onl...

10/30/2014  Addressing the Rare Word Problem in Neural Machine Translation
  Neural Machine Translation (NMT) is a new approach to machine translatio...

12/05/2017  Neural Machine Translation by Generating Multiple Linguistic Factors
  Factored neural machine translation (FNMT) is founded on the idea of usi...
12/20/2018  How Much Does Tokenization Affect Neural Machine Translation?
  Tokenization or segmentation is a wide concept that covers simple proces...

06/12/2017  Attention-based Vocabulary Selection for NMT Decoding
  Neural Machine Translation (NMT) models usually use large target vocabul...

07/01/2022  Reduce Indonesian Vocabularies with an Indonesian Sub-word Separator
  Indonesian is an agglutinative language since it has a compounding proce...
