Moment Matching Training for Neural Machine Translation: A Preliminary Study

12/24/2018
by   Cong Duy Vu Hoang, et al.
0

In previous works, neural sequence models have been shown to improve significantly if external prior knowledge can be provided, for instance by allowing the model to access the embeddings of explicit features during both training and inference. In this work, we propose a different point of view on how to incorporate prior knowledge in a principled way, using a moment matching framework. In this approach, the standard local cross-entropy training of the sequential model is combined with a moment matching training mode that encourages the equality of the expectations of certain predefined features between the model distribution and the empirical distribution. In particular, we show how to derive unbiased estimates of some stochastic gradients that are central to the training, and compare our framework with a formally related one: policy gradient training in reinforcement learning, pointing out some important differences in terms of the kinds of prior assumptions in both approaches. Our initial results are promising, showing the effectiveness of our proposed framework.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/02/2018

Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization

Although neural machine translation has made significant progress recent...
research
09/30/2018

Bayesian Transfer Reinforcement Learning with Prior Knowledge Rules

We propose a probabilistic framework to directly insert prior knowledge ...
research
02/18/2020

KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

Reinforcement learning agents usually learn from scratch, which requires...
research
04/30/2020

Language Model Prior for Low-Resource Neural Machine Translation

The scarcity of large parallel corpora is an important obstacle for neur...
research
11/19/2020

Classification by Attention: Scene Graph Classification with Prior Knowledge

A main challenge in scene graph classification is that the appearance of...
research
08/22/2023

Simulation-Based Prior Knowledge Elicitation for Parametric Bayesian Models

A central characteristic of Bayesian statistics is the ability to consis...

Please sign up or login with your details

Forgot password? Click here to reset