Fast Training of NMT Model with Data Sorting

08/16/2023
by Daniela N. Rim et al.

The Transformer model has revolutionized Natural Language Processing tasks such as Neural Machine Translation, and many efforts have been made to study its architecture and to improve its efficiency and accuracy. One remaining source of waste is the computation spent on padding (empty) tokens, which the Transformer processes only to discard later, creating an unnecessary computational burden. To address this, we propose an algorithm that sorts translation sentence pairs by length before batching, minimizing the waste of computing power. Since fully sorting the data could violate the independent and identically distributed (i.i.d.) assumption, we sort the data only partially. In experiments on English-Korean and English-Luganda machine translation, the proposed method reduces computational time while maintaining performance. Our method is independent of the architecture, so it can be easily integrated into any training process with variable-length data.
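As a concrete illustration of the partial-sorting idea, here is a minimal sketch in Python. It is not the paper's implementation: the function and parameter names (`partially_sorted_batches`, `sort_window`, `shuffle_batches`) are illustrative, and the exact windowing scheme the authors use may differ. The sketch sorts examples by length only within fixed-size windows, batches each window, and then shuffles the batch order so training does not see lengths in monotonic order.

```python
import random

def partially_sorted_batches(pairs, batch_size, sort_window=1024, shuffle_batches=True):
    """Group sentence pairs into batches of similar length to reduce padding.

    `pairs` is a list of (source_tokens, target_tokens) tuples. Sorting is done
    only within windows of `sort_window` examples, so the data stays close to
    i.i.d. while most padding tokens are still eliminated. Names and defaults
    here are illustrative, not taken from the paper.
    """
    batches = []
    for start in range(0, len(pairs), sort_window):
        # Slicing copies the window, so the original data order is untouched.
        window = pairs[start:start + sort_window]
        # Sort each window by source length; target length works similarly.
        window.sort(key=lambda pair: len(pair[0]))
        # Cut the sorted window into batches of (nearly) uniform length.
        for i in range(0, len(window), batch_size):
            batches.append(window[i:i + batch_size])
    if shuffle_batches:
        # Shuffle batch order so training does not see a monotonically
        # increasing sequence of sentence lengths.
        random.shuffle(batches)
    return batches
```

Sorting within a window keeps sentences of similar length together, so each batch needs little padding, while limiting the window size preserves most of the randomness that the i.i.d. assumption relies on; `sort_window` trades padding savings against shuffling quality.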


