Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML

05/19/2017
by   Xuezhe Ma, et al.
0

Reward augmented maximum likelihood (RAML), a simple and effective learning framework to directly optimize towards the reward function in structured prediction tasks, has led to a number of impressive empirical successes. RAML incorporates task-specific reward by performing maximum-likelihood updates on candidate outputs sampled according to an exponentiated payoff distribution, which gives higher probabilities to candidates that are close to the reference output. While RAML is notable for its simplicity, efficiency, and its impressive empirical successes, the theoretical properties of RAML, especially the behavior of the exponentiated payoff distribution, has not been examined thoroughly. In this work, we introduce softmax Q-distribution estimation, a novel theoretical interpretation of RAML, which reveals the relation between RAML and Bayesian decision theory. The softmax Q-distribution can be regarded as a smooth approximation of the Bayes decision boundary, and the Bayes decision rule is achieved by decoding with this Q-distribution. We further show that RAML is equivalent to approximately estimating the softmax Q-distribution, with the temperature τ controlling approximation error. We perform two experiments, one on synthetic data of multi-class classification and one on real data of image captioning, to demonstrate the relationship between RAML and the proposed softmax Q-distribution estimation method, verifying our theoretical analysis. Additional experiments on three structured prediction tasks with rewards defined on sequential (named entity recognition), tree-based (dependency parsing) and irregular (machine translation) structures show notable improvements over maximum likelihood baselines.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/16/2015

Parameter estimation in softmax decision-making models with linear objective functions

With an eye towards human-centered automation, we contribute to the deve...
research
03/15/2020

Analysis of Softmax Approximation for Deep Classifiers under Input-Dependent Label Noise

Modelling uncertainty arising from input-dependent label noise is an inc...
research
10/03/2018

Inhibited Softmax for Uncertainty Estimation in Neural Networks

We present a new method for uncertainty estimation and out-of-distributi...
research
12/28/2020

Neural Text Generation with Artificial Negative Examples

Neural text generation models conditioning on given input (e.g. machine ...
research
06/23/2023

Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation

Image captioning aims to describe visual content in natural language. As...
research
06/08/2018

Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations

Many structured prediction problems (particularly in vision and language...
research
09/29/2022

GROOT: Corrective Reward Optimization for Generative Sequential Labeling

Sequential labeling is a fundamental NLP task, forming the backbone of m...

Please sign up or login with your details

Forgot password? Click here to reset