Analyzing Architectures for Neural Machine Translation Using Low Computational Resources

11/06/2021
by Aditya Mandke, et al.

With recent developments in the field of Natural Language Processing, there has been a rise in the use of different architectures for Neural Machine Translation. Transformer architectures are used to achieve state-of-the-art accuracy, but they are very computationally expensive to train. Not everyone can afford setups with high-end GPUs and other resources, so we train our models on low computational resources and investigate the results. As expected, transformers outperformed the other architectures, but there were some surprising results: transformers with more encoder and decoder layers took longer to train yet achieved lower BLEU scores. The LSTM performed well in the experiments and took considerably less time to train than the transformers, making it suitable for situations with time constraints.
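As a rough illustration of the compute trade-off described above, the sketch below compares the parameter counts of a small Transformer and an LSTM encoder-decoder in PyTorch. This is not the authors' code; the layer counts and hyperparameters are hypothetical, chosen only to show how adding encoder and decoder layers grows the model that must be trained.

    # Illustrative sketch (hypothetical hyperparameters, not the paper's settings):
    # compare the size of a small Transformer against an LSTM encoder-decoder.
    import torch.nn as nn

    def count_params(model: nn.Module) -> int:
        # Total number of trainable and non-trainable parameters.
        return sum(p.numel() for p in model.parameters())

    # Small Transformer with 2 encoder and 2 decoder layers.
    transformer = nn.Transformer(
        d_model=256, nhead=4,
        num_encoder_layers=2, num_decoder_layers=2,
        dim_feedforward=512,
    )

    # LSTM encoder-decoder of comparable hidden size and depth.
    lstm_encoder = nn.LSTM(input_size=256, hidden_size=256, num_layers=2, batch_first=True)
    lstm_decoder = nn.LSTM(input_size=256, hidden_size=256, num_layers=2, batch_first=True)

    print(f"Transformer parameters: {count_params(transformer):,}")
    print(f"LSTM encoder-decoder parameters: "
          f"{count_params(lstm_encoder) + count_params(lstm_decoder):,}")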


