Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes

07/05/2019
by Chinnadhurai Sankar, et al.

Open-domain dialog systems tend to be repetitive and to produce generic responses. In this paper, we demonstrate that conditioning response generation on interpretable discrete dialog attributes and composed attributes improves model perplexity and yields diverse, interesting, non-redundant responses. We propose to formulate dialog attribute prediction as a reinforcement learning (RL) problem and use policy gradient methods to optimize utterance generation with long-term rewards. Unlike existing RL approaches, which treat token prediction as the policy, our method reduces the complexity of policy optimization by limiting the action space to dialog attributes, making policy optimization more practical and sample-efficient. We demonstrate this with experimental and human evaluations.
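To make the idea of a policy over dialog attributes concrete, the following is a minimal sketch of a REINFORCE-style policy gradient update on a small discrete action space. It is not the paper's implementation. The attribute names, the reward values, the function names, and the hyperparameters are all hypothetical, and the setting is simplified to a single-step problem with no dialog state and no long-term reward, so it only illustrates why a handful of attribute actions is a far smaller search space than a token vocabulary.

```python
import math
import random


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index from a categorical distribution."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def train_attribute_policy(reward_fn, num_attributes, steps=5000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline over discrete attributes.

    The policy is a softmax over one logit per attribute (an assumption made
    for this sketch; a real model would condition on the dialog context).
    """
    rng = random.Random(seed)
    logits = [0.0] * num_attributes
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        action = sample(probs, rng)
        reward = reward_fn(action, rng)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
        for k in range(num_attributes):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)


# Hypothetical attributes and mean rewards, chosen only for illustration.
ATTRIBUTES = ["statement", "question", "agreement", "generic_reply"]
MEAN_REWARD = [0.5, 0.8, 0.4, 0.1]


def toy_reward(attribute_index, rng):
    """Noisy stand-in for a dialog-level reward signal."""
    return MEAN_REWARD[attribute_index] + rng.gauss(0.0, 0.1)


probs = train_attribute_policy(toy_reward, len(ATTRIBUTES))
for name, p in zip(ATTRIBUTES, probs):
    print(f"{name}: {p:.3f}")
```

In this toy setup the policy has only four actions, so each sampled reward updates every logit directly. A token-level policy would instead have to assign credit across a vocabulary-sized action space at every generation step.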
