Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation

07/12/2020
by   Yuxuan Song, et al.
4

Auto-regressive sequence generative models trained by Maximum Likelihood Estimation suffer the exposure bias problem in practical finite sample scenarios. The crux is that the number of training samples for Maximum Likelihood Estimation is usually limited and the input data distributions are different at training and inference stages. Many method shave been proposed to solve the above problem (Yu et al., 2017; Lu et al., 2018), which relies on sampling from the non-stationary model distribution and suffers from high variance or biased estimations. In this paper, we proposeψ-MLE, a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating at large sample space encountered in text generation. We derive our algorithm from a new perspective of self-augmentation and introduce bias correction with density ratio estimation. Extensive experimental results on synthetic data and real-world text generation tasks demonstrate that our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.

READ FULL TEXT
research
09/29/2022

Maximum likelihood estimation of the Weibull distribution with reduced bias

In this short note we derive a new bias-adjusted maximum likelihood esti...
research
04/11/2018

CoT: Cooperative Training for Generative Modeling

We propose Cooperative Training (CoT) for training generative models tha...
research
10/19/2022

Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation

We introduce a simple modification to the standard maximum likelihood es...
research
01/02/2021

SDA: Improving Text Generation with Self Data Augmentation

Data augmentation has been widely used to improve deep neural networks i...
research
02/26/2023

Tailoring Language Generation Models under Total Variation Distance

The standard paradigm of neural language generation adopts maximum likel...
research
08/07/2023

Generative Forests

Tabular data represents one of the most prevalent form of data. When it ...
research
05/08/2023

CURTAINs Flows For Flows: Constructing Unobserved Regions with Maximum Likelihood Estimation

Model independent techniques for constructing background data templates ...

Please sign up or login with your details

Forgot password? Click here to reset