Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits

10/13/2021
by Julia Kreutzer, et al.

Training data for machine translation (MT) is often sourced from a multitude of large corpora that are multi-faceted in nature, e.g., containing content from multiple domains or of different levels of quality or complexity. Naturally, these facets do not occur with equal frequency, nor are they equally important for the test scenario at hand. In this work, we propose to optimize this balance jointly with the MT model parameters to relieve system developers from manual schedule design. A multi-armed bandit is trained to dynamically choose between facets in the way that is most beneficial for the MT system. We evaluate it on three different multi-facet applications: balancing translationese and natural training data, balancing data from multiple domains, and balancing data from multiple language pairs. We find that bandit learning leads to competitive MT systems across tasks, and our analysis provides insights into its learned strategies and the underlying data sets.
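To make the idea of a bandit choosing between facets concrete, here is a minimal sketch. The abstract does not say which bandit algorithm or reward signal is used, so everything specific below is an assumption: the sketch uses EXP3, a standard adversarial bandit algorithm; the facet names and their "usefulness" values are hypothetical; and the simulated 0/1 reward merely stands in for whatever training signal (for example, an improvement in held-out loss) a real MT system would feed back.

```python
import math
import random


class Exp3FacetSampler:
    """EXP3 bandit over data facets (arms). Rewards are assumed to lie in [0, 1]."""

    def __init__(self, facets, gamma=0.1, seed=0):
        self.facets = list(facets)
        self.gamma = gamma  # exploration rate (assumed value)
        self.weights = [1.0] * len(self.facets)
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        k = len(self.facets)
        return [(1 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def choose(self):
        # Sample one facet index; return it with its sampling probability.
        probs = self.probabilities()
        index = self.rng.choices(range(len(self.facets)), weights=probs, k=1)[0]
        return index, probs[index]

    def update(self, index, prob, reward):
        # Importance-weighted reward estimate for the facet that was played.
        estimate = reward / prob
        self.weights[index] *= math.exp(self.gamma * estimate / len(self.facets))
        # Rescale by the largest weight to avoid floating-point overflow;
        # this leaves the sampling probabilities unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(steps=2000):
    # Hypothetical probability that a batch from each facet helps the model.
    usefulness = {"natural": 0.8, "translationese": 0.3}
    sampler = Exp3FacetSampler(usefulness.keys(), gamma=0.1, seed=0)
    for _ in range(steps):
        index, prob = sampler.choose()
        facet = sampler.facets[index]
        # Stand-in for "train on a batch from this facet and measure the gain".
        reward = 1.0 if sampler.rng.random() < usefulness[facet] else 0.0
        sampler.update(index, prob, reward)
    return dict(zip(sampler.facets, sampler.probabilities()))


if __name__ == "__main__":
    print(simulate())
```

In an actual training loop, the call to `choose` would decide which facet the next batch is drawn from, and `update` would be called once the effect of that batch on the model has been measured. The exploration term keeps every facet at a nonzero sampling probability, so the sampler can still react if a facet's usefulness changes during training.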

