Training Neural Machine Translation (NMT) Models using Tensor Train Decomposition on TensorFlow (T3F)

11/05/2019
by Amelia Drew, et al.

We implement a Tensor Train layer in the TensorFlow Neural Machine Translation (NMT) model using the t3f library. We perform training runs on the IWSLT English-Vietnamese '15 and WMT German-English '16 datasets with learning rates ∈ {0.0004, 0.0008, 0.0012}, maximum ranks ∈ {2, 4, 8, 16} and a range of core dimensions. We compare against a target BLEU test score of 24.0, obtained by our benchmark run. For the IWSLT English-Vietnamese training, we obtain BLEU test/dev scores of 24.0/21.9 and 24.2/21.9 using core dimensions (2, 2, 256) × (2, 2, 512) with learning rate 0.0012 and rank distributions (1, 4, 4, 1) and (1, 4, 16, 1) respectively. These runs use 113% and 397% of the FLOPs of the benchmark run respectively. We find that, of the parameters surveyed, a higher learning rate and more 'rectangular' core dimensions generally produce higher BLEU scores. For the WMT German-English dataset, we obtain BLEU scores of 24.0/23.8 using core dimensions (4, 4, 128) × (4, 4, 256) with learning rate 0.0012 and rank distribution (1, 2, 2, 1). We discuss the potential for future optimization and application of Tensor Train decomposition to other NMT models.
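
To give a feel for how the rank distribution and core dimensions quoted above interact, the following is a minimal sketch that counts the parameters of a TT-matrix. It assumes the standard TT-matrix format, in which core k has shape r_{k-1} × m_k × n_k × r_k, and it assumes that the two tuples in "(2, 2, 256) × (2, 2, 512)" factor the row and column dimensions of the weight matrix. The function names are hypothetical and are not part of t3f or of the NMT code. Parameter counts are not the FLOP figures reported in the abstract; they only illustrate the trend.

```python
from math import prod


def tt_matrix_params(row_dims, col_dims, ranks):
    """Number of parameters in a TT-matrix (assumed standard format).

    Core k is assumed to have shape (ranks[k], row_dims[k], col_dims[k], ranks[k + 1]).
    """
    if len(row_dims) != len(col_dims):
        raise ValueError("row_dims and col_dims must have the same length")
    if len(ranks) != len(row_dims) + 1:
        raise ValueError("ranks must have one more entry than the number of cores")
    return sum(
        ranks[k] * row_dims[k] * col_dims[k] * ranks[k + 1]
        for k in range(len(row_dims))
    )


def dense_params(row_dims, col_dims):
    """Number of parameters in the equivalent dense matrix."""
    return prod(row_dims) * prod(col_dims)


if __name__ == "__main__":
    # Configurations taken from the abstract; the row/column reading is an assumption.
    configs = [
        ((2, 2, 256), (2, 2, 512), (1, 4, 4, 1)),
        ((2, 2, 256), (2, 2, 512), (1, 4, 16, 1)),
        ((4, 4, 128), (4, 4, 256), (1, 2, 2, 1)),
    ]
    for rows, cols, ranks in configs:
        tt = tt_matrix_params(rows, cols, ranks)
        dense = dense_params(rows, cols)
        print(rows, cols, ranks, "TT:", tt, "dense:", dense)
```

Under these assumptions, the (1, 4, 4, 1) configuration stores 524,368 parameters against 2,097,152 for the dense 1024 × 2048 matrix, while (1, 4, 16, 1) stores 2,097,424, slightly more than the dense matrix. With a large final core, raising the rank quickly removes any compression, which points in the same direction as the higher cost reported for that run.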


