Learning to Multi-Task Learn for Better Neural Machine Translation

01/10/2020
by Poorya Zaremoodi, et al.

Scarcity of parallel sentence pairs is a major challenge for training high-quality neural machine translation (NMT) models in bilingually low-resource scenarios, as NMT is data-hungry. Multi-task learning is an elegant approach to inject linguistically motivated inductive biases into NMT, using auxiliary syntactic and semantic tasks, to improve generalisation. The challenge, however, is to devise effective training schedules, prescribing when to make use of the auxiliary tasks during the training process to fill the knowledge gaps of the main translation task, a setting referred to as biased-MTL. Current approaches to the training schedule are based on hand-engineered heuristics, whose effectiveness varies across MTL settings. We propose a novel framework for learning the training schedule, i.e. learning to multi-task learn, for the MTL setting of interest. We formulate the training schedule as a Markov decision process, which paves the way for employing policy learning methods to learn the scheduling policy. We effectively and efficiently learn the training schedule policy within the imitation learning framework using an oracle policy algorithm that dynamically sets the importance weights of auxiliary tasks based on their contributions to the generalisability of the main NMT task. Experiments on low-resource NMT settings show the resulting automatically learned training schedulers are competitive with the best heuristics, and lead to up to +1.1 BLEU score improvements.
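The abstract does not spell out how the oracle scores an auxiliary task beyond "its contribution to the generalisability of the main NMT task". The sketch below is one plausible instantiation of that idea, not the authors' implementation: each auxiliary task is weighted by how much a single probe update on that task reduces the main task's held-out loss. All names (oracle_task_weights, the toy linear model, the synthetic data) are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): weight auxiliary tasks by
# their measured contribution to the main task's held-out loss.
import copy
import torch
import torch.nn as nn

def heldout_loss(model, x_val, y_val):
    # Main-task generalisation proxy: loss on a held-out set.
    with torch.no_grad():
        return nn.functional.mse_loss(model(x_val), y_val).item()

def oracle_task_weights(model, aux_tasks, x_val, y_val, lr=0.1):
    """Score each auxiliary task by how much one probe update on it
    reduces the main task's held-out loss; normalise into weights."""
    base = heldout_loss(model, x_val, y_val)
    scores = []
    for x_t, y_t in aux_tasks:
        probe = copy.deepcopy(model)           # throwaway copy for the probe step
        opt = torch.optim.SGD(probe.parameters(), lr=lr)
        loss = nn.functional.mse_loss(probe(x_t), y_t)
        opt.zero_grad(); loss.backward(); opt.step()
        # Only reward tasks that actually help the main task.
        scores.append(max(base - heldout_loss(probe, x_val, y_val), 0.0))
    total = sum(scores) or 1.0
    return [s / total for s in scores]

# Toy usage: a linear "model", a main-task validation set, two auxiliary tasks.
torch.manual_seed(0)
model = nn.Linear(4, 1)
x_val, y_val = torch.randn(32, 4), torch.randn(32, 1)
aux_tasks = [(torch.randn(32, 4), torch.randn(32, 1)) for _ in range(2)]
print(oracle_task_weights(model, aux_tasks, x_val, y_val))
```

In the paper's framing, a signal of this kind would serve as the oracle that the scheduling policy is trained to imitate, so the learned scheduler can be applied without repeatedly probing held-out performance at training time.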

