
Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation

by Samuel Kiegeland et al.

Policy gradient algorithms have found wide adoption in NLP, but have recently come under criticism questioning their suitability for NMT. Choshen et al. (2020) identify multiple weaknesses and suspect that the success of these methods is determined by the shape of the output distributions rather than the reward. In this paper, we revisit these claims and study them under a wider range of configurations. Our experiments on in-domain and cross-domain adaptation reveal the importance of exploration and reward scaling, and provide empirical counter-evidence to these claims.
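To make the abstract's terms concrete, the following is a minimal, self-contained sketch of a REINFORCE-style policy gradient update with batch reward standardization (one common form of the "reward scaling" the abstract mentions). It is not the paper's actual NMT setup: the toy categorical policy, the 4-token vocabulary, and the single "correct" token reward are illustrative assumptions.

```python
import numpy as np

# Toy setup (NOT the paper's NMT model): a categorical policy over a
# 4-token vocabulary, with reward 1 for sampling a hypothetical
# "correct" token and 0 otherwise.
rng = np.random.default_rng(0)
VOCAB = 4
TARGET = 2                    # hypothetical correct token (assumption)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def reinforce_step(logits, lr=0.5, batch=64, scale_rewards=True):
    """One REINFORCE update; optionally standardize rewards in-batch."""
    probs = softmax(logits)
    samples = rng.choice(VOCAB, size=batch, p=probs)
    rewards = (samples == TARGET).astype(float)
    if scale_rewards:
        # Reward scaling: zero-mean, unit-variance rewards act as a
        # baseline and keep gradient magnitudes comparable across batches.
        rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad = np.zeros(VOCAB)
    for a, r in zip(samples, rewards):
        one_hot = np.eye(VOCAB)[a]
        grad += r * (one_hot - probs)   # r * grad of log pi(a | logits)
    return logits + lr * grad / batch

logits = np.zeros(VOCAB)                # uniform initial policy
for _ in range(200):
    logits = reinforce_step(logits)

print(softmax(logits)[TARGET])          # probability mass moves to TARGET
```

Sampling from the policy (rather than greedy decoding) is what provides the exploration the abstract refers to; with `scale_rewards=False` and sparse 0/1 rewards, updates are noisier and all-zero-reward batches contribute nothing.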

