On optimal transformer depth for low-resource language translation

04/09/2020
by Elan Van Biljon, et al.

Transformers have shown great promise as an approach to Neural Machine Translation (NMT) for low-resource languages. At the same time, however, transformer models remain difficult to optimize and require careful tuning of hyper-parameters to be useful in this setting. Many NMT toolkits ship with a set of default hyper-parameters, which researchers and practitioners often adopt for the sake of convenience and to avoid tuning. These configurations, however, have been optimized for large-scale machine translation datasets with several million parallel sentences for European languages like English and French. In this work, we find that the current trend in the field towards very large models is detrimental for low-resource languages, since it makes training more difficult and hurts overall performance, confirming previous observations. We see our work as complementary to the Masakhane project ("Masakhane" means "We Build Together" in isiZulu). In this spirit, low-resource NMT systems are now being built by the communities who need them the most. However, many in these communities still have very limited access to the kind of computational resources required to build the extremely large models promoted by industrial research. Therefore, by showing that transformer models perform well (and often best) at low-to-moderate depth, we hope to convince fellow researchers to devote fewer computational resources, and less time, to exploring overly large models during the development of these systems.
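To make the depth argument concrete, the sketch below contrasts a typical "base"-style transformer depth with a shallower variant and compares their parameter counts. This is a minimal illustration using PyTorch's nn.Transformer as a stand-in; the specific depths and dimensions here are assumptions for illustration, not the exact configurations or toolkit used in the paper.

    import torch.nn as nn

    # A common default configuration (roughly the "base" transformer):
    # 6 encoder and 6 decoder layers, tuned on large parallel corpora.
    default_model = nn.Transformer(
        d_model=512, nhead=8,
        num_encoder_layers=6, num_decoder_layers=6,
        dim_feedforward=2048, dropout=0.1,
    )

    # A shallower variant of the kind the paper argues often works best
    # for low-resource pairs: fewer layers, substantially fewer parameters.
    shallow_model = nn.Transformer(
        d_model=512, nhead=8,
        num_encoder_layers=3, num_decoder_layers=3,
        dim_feedforward=2048, dropout=0.1,
    )

    def count_params(model):
        # Total number of trainable parameters in the model.
        return sum(p.numel() for p in model.parameters() if p.requires_grad)

    print(f"default (6+6 layers): {count_params(default_model):,} parameters")
    print(f"shallow (3+3 layers): {count_params(shallow_model):,} parameters")

Halving the depth roughly halves the layer-stack parameters, which reduces memory use and training time on the modest hardware that is typical in low-resource settings.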


