Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback

07/24/2017
by Khanh Nguyen, et al.

Machine translation is a natural candidate problem for reinforcement learning from human feedback: users provide quick, dirty ratings on candidate translations to guide a system to improve. Yet, current neural machine translation training focuses on expensive human-generated reference translations. We describe a reinforcement learning algorithm that improves neural machine translation systems from simulated human feedback. Our algorithm combines the advantage actor-critic algorithm (Mnih et al., 2016) with the attention-based neural encoder-decoder architecture (Luong et al., 2015). This algorithm (a) is well-designed for problems with a large action space and delayed rewards, (b) effectively optimizes traditional corpus-level machine translation metrics, and (c) is robust to skewed, high-variance, granular feedback modeled after actual human behaviors.
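
The abstract names three properties of the simulated feedback (skewed, high-variance, granular) and an advantage-based update, but does not say how either is computed. The following is a minimal sketch, not the authors' implementation: the function names, the power-law skew, the Gaussian noise, the five-level rating scale, and the example scores are all assumptions chosen for illustration.

```python
import random


def skewed(score, power=2.0):
    """Skew a score in [0, 1] (assumed form): power > 1 models a harsh
    rater, power < 1 a lenient one."""
    return score ** power


def noisy(score, stddev=0.1, rng=random):
    """Add zero-mean Gaussian noise (assumed form) and clip back to [0, 1]."""
    return min(1.0, max(0.0, score + rng.gauss(0.0, stddev)))


def granular(score, bins=5):
    """Round a score in [0, 1] to the nearest of `bins` evenly spaced
    levels, like a star rating."""
    return round(score * (bins - 1)) / (bins - 1)


def advantage(reward, baseline):
    """Advantage that scales the policy-gradient update: the observed
    reward minus the critic's predicted reward."""
    return reward - baseline


rng = random.Random(0)       # seeded so the sketch is repeatable
true_score = 0.62            # hypothetical sentence-level quality in [0, 1]
rating = granular(noisy(skewed(true_score), rng=rng))
adv = advantage(rating, baseline=0.4)  # 0.4 is a hypothetical critic estimate
```

In an actor-critic setup, a positive `adv` would increase the probability of the sampled translation and a negative one would decrease it. The critic's baseline is what reduces the variance that noisy ratings would otherwise inject into the update.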

research
08/03/2017

The UMD Neural Machine Translation Systems at WMT17 Bandit Learning Task

We describe the University of Maryland machine translation systems submi...
research
04/16/2018

Can Neural Machine Translation be Improved with User Feedback?

We present the first real-world application of methods for improving neu...
research
07/04/2019

Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation

We propose an interactive-predictive neural machine translation framewor...
research
10/01/2019

Machine Translation for Machines: the Sentiment Classification Use Case

We propose a neural machine translation (NMT) approach that, instead of ...
research
05/27/2018

Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning

We present a study on reinforcement learning (RL) from human bandit feed...
research
05/03/2018

A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation

We present an approach to interactive-predictive neural machine translat...
research
05/27/2021

Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort

In Machine Translation, assessing the quality of a large amount of autom...
