When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?

by Ye Qi, et al.
Carnegie Mellon University

The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases -- providing gains of up to 20 BLEU points in the most favorable setting.

How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?

Using pre-trained word embeddings as input layer is a common practice in...

BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation

The success of bidirectional encoders using masked language models, such...

Debiasing Word Embeddings Improves Multimodal Machine Translation

In recent years, pretrained word embeddings have proved useful for multi...

Multimodal Machine Translation with Embedding Prediction

Multimodal machine translation is an attractive application of neural ma...

From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction

A great deal of historical corpora suffer from errors introduced by the ...

Better Neural Machine Translation by Extracting Linguistic Information from BERT

Adding linguistic information (syntax or semantics) to neural machine tr...