Towards Diverse Text Generation with Inverse Reinforcement Learning

04/30/2018
by Zhan Shi, et al.

Text generation is a crucial task in NLP. Recently, several adversarial generative models have been proposed to mitigate the exposure bias problem in text generation. Although these models have achieved great success, they still suffer from reward sparsity and mode collapse. To address these two problems, in this paper we employ inverse reinforcement learning (IRL) for text generation. Specifically, the IRL framework learns a reward function from the training data, and then an optimal policy that maximizes the expected total reward. As in the adversarial models, the reward and policy functions in IRL are optimized alternately. Our method has two advantages: (1) the learned reward function can produce denser reward signals; (2) the generation policy, trained by an "entropy regularized" policy gradient, is encouraged to generate more diverse texts. Experimental results demonstrate that our proposed method generates higher-quality texts than previous methods.
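The alternating scheme the abstract describes can be illustrated on a toy problem. The sketch below, under assumed hyperparameters (`TAU`, `LR`, the single-token "vocabulary", and the tabular reward `phi` are all illustrative, not the paper's actual model), alternates (1) a reward step that raises the reward on training data and lowers it on policy samples in expectation, and (2) a policy step that follows the exact gradient of the entropy-regularized objective J = E_pi[r] + TAU * H(pi):

```python
import math

VOCAB = ["a", "b", "c"]
TAU = 0.1   # entropy-regularization temperature (assumed)
LR = 0.5    # policy learning rate (assumed)

# Toy "training data": the majority token is "a"
real_data = ["a"] * 8 + ["b"] * 2

theta = {t: 0.0 for t in VOCAB}  # policy logits
phi = {t: 0.0 for t in VOCAB}    # tabular reward r_phi(t)

def softmax(logits):
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

for step in range(200):
    pi = softmax(theta)
    # --- reward step: push reward up on data, down on policy samples ---
    for t in real_data:
        phi[t] += 0.01
    for t in VOCAB:
        phi[t] -= 0.01 * len(real_data) * pi[t]
    # --- policy step: exact entropy-regularized policy gradient ---
    # dJ/dtheta_k = pi_k * [(r_k - TAU*log(pi_k)) - E_pi[r - TAU*log(pi)]]
    baseline = sum(pi[t] * (phi[t] - TAU * math.log(pi[t])) for t in VOCAB)
    for t in VOCAB:
        adv = phi[t] - TAU * math.log(pi[t]) - baseline
        theta[t] += LR * pi[t] * adv

pi = softmax(theta)
print(pi)  # policy should concentrate on "a", the majority token
```

Because of the entropy term, the policy converges to a softened distribution (roughly pi ∝ exp(r/TAU)) rather than collapsing onto the single highest-reward token, which is the diversity-preserving effect the abstract refers to.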


