Small Batch Sizes Improve Training of Low-Resource Neural MT

03/20/2022
by   Àlex R. Atrio, et al.

We study the role of an essential hyper-parameter that governs the training of Transformers for neural machine translation in a low-resource setting: the batch size. Using theoretical insights and experimental evidence, we argue against the widespread belief that batch size should be set as large as the memory of the GPUs allows. We show that in a low-resource setting, a smaller batch size leads to higher scores in a shorter training time, and we argue that this is due to better regularization of the gradients during training.
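The regularization argument rests on a standard statistical fact: the mini-batch gradient is a sample mean over per-example gradients, so its variance scales as 1/B. Smaller batches therefore inject more gradient noise per step, which can act as implicit regularization. The toy simulation below (not from the paper; the per-example gradient distribution and batch sizes are illustrative assumptions) makes this scaling concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: treat per-example "gradients" as draws from a fixed
# distribution. The mini-batch gradient is their mean over B examples,
# so its standard deviation scales as 1/sqrt(B): smaller batches give
# noisier updates -- the implicit regularization the abstract refers to.
per_example_grads = rng.normal(loc=1.0, scale=2.0, size=100_000)

def batch_grad_std(batch_size, n_batches=2_000):
    """Empirical std of the mini-batch mean gradient."""
    idx = rng.integers(0, per_example_grads.size,
                       size=(n_batches, batch_size))
    batch_means = per_example_grads[idx].mean(axis=1)
    return batch_means.std()

small = batch_grad_std(32)     # hypothetical "small batch"
large = batch_grad_std(4096)   # hypothetical "large batch"
print(f"std of batch gradient, B=32:   {small:.3f}")
print(f"std of batch gradient, B=4096: {large:.3f}")
# Expect a ratio of roughly sqrt(4096/32) ~ 11x more noise at B=32.
```

Whether this extra noise helps or hurts depends on the data regime; the paper's claim is that in low-resource NMT it acts as a useful regularizer.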


