Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

10/01/2019
by Kenton Murray, et al.

Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.
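The mechanism the abstract describes, deleting neurons during a single training run via regularization, is commonly realized with a group regularizer applied through a proximal gradient step. The sketch below illustrates that idea under stated assumptions: it uses PyTorch and an L2,1 (group lasso) penalty over the rows of a feed-forward weight matrix; the names `proximal_l21` and `prune_strength`, the toy loss, and the layer sizes are illustrative, not the paper's implementation.

```python
# Minimal sketch of auto-sizing via an L2,1 group regularizer with a
# proximal step (assumption: PyTorch; names and loss are illustrative).
import torch
import torch.nn as nn

def proximal_l21(weight: torch.Tensor, strength: float) -> None:
    """Shrink each row (one hidden unit's fan-in) toward zero; rows whose
    L2 norm falls below `strength` are zeroed, deleting that neuron."""
    with torch.no_grad():
        norms = weight.norm(dim=1, keepdim=True)              # per-neuron L2 norm
        scale = torch.clamp(1.0 - strength / (norms + 1e-12), min=0.0)
        weight.mul_(scale)                                    # soft-threshold rows

# Toy stand-in for the Transformer's position-wise feed-forward sublayer.
ffn = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
opt = torch.optim.SGD(ffn.parameters(), lr=0.1)
prune_strength = 0.001  # hypothetical: regularizer weight times learning rate

for step in range(100):
    x = torch.randn(32, 512)
    loss = ffn(x).pow(2).mean()          # stand-in for the translation loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    proximal_l21(ffn[0].weight, prune_strength)  # prune hidden units in place

alive = (ffn[0].weight.norm(dim=1) > 0).sum().item()
print(f"surviving hidden units: {alive}/2048")
```

Rows driven exactly to zero correspond to deleted neurons, and after training they can be physically removed to shrink the model, which is how auto-sizing can cut parameters while training only once.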

Related research

11/04/2020: Optimizing Transformer for Low-Resource Neural Machine Translation
    Language pairs with limited amounts of parallel data, also known as low-...

08/11/2023: Optimizing transformer-based machine translation model for single GPU training: a hyperparameter ablation study
    In machine translation tasks, the relationship between model complexity ...

09/01/2021: Survey of Low-Resource Machine Translation
    We present a survey covering the state of the art in low-resource machin...

12/24/2022: Optimizing Deep Transformers for Chinese-Thai Low-Resource Translation
    In this paper, we study the use of deep Transformer translation model fo...

06/07/2021: Lexicon Learning for Few-Shot Neural Sequence Modeling
    Sequence-to-sequence transduction is the core problem in language proces...

01/30/2019: The Evolved Transformer
    Recent works have highlighted the strengths of the Transformer architect...

07/17/2021: Dynamic Transformer for Efficient Machine Translation on Embedded Devices
    The Transformer architecture is widely used for machine translation task...
