Learning Dialog Policies from Weak Demonstrations

04/23/2020
by Gabriel Gordon-Hall, et al.

Deep reinforcement learning is a promising approach to training a dialog manager, but current methods struggle with the large state and action spaces of multi-domain dialog systems. Building upon Deep Q-learning from Demonstrations (DQfD), an algorithm that scores highly in difficult Atari games, we leverage dialog data to guide the agent to successfully respond to a user's requests. We make progressively fewer assumptions about the data needed, using labeled, reduced-labeled, and even unlabeled data to train expert demonstrators. We introduce Reinforced Fine-tune Learning, an extension to DQfD that enables us to overcome the domain gap between the datasets and the environment. Experiments in a challenging multi-domain dialog system framework validate our approaches, which achieve high success rates even when trained on out-of-domain data.

research
04/17/2020

Show Us the Way: Learning to Manage Dialog from Demonstrations

We present our submission to the End-to-End Multi-Domain Dialog Challeng...
research
07/01/2022

Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings

Learning task-oriented dialog policies via reinforcement learning typica...
research
08/02/2017

Deep Reinforcement Learning for Inquiry Dialog Policies with Logical Formula Embeddings

This paper is the first attempt to learn the policy of an inquiry dialog...
research
07/13/2023

Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative

Dialog policies, which determine a system's action based on the current ...
research
04/25/2022

"Think Before You Speak": Improving Multi-Action Dialog Policy by Planning Single-Action Dialogs

Multi-action dialog policy (MADP), which generates multiple atomic dialo...
research
06/23/2015

Multi-domain Dialog State Tracking using Recurrent Neural Networks

Dialog state tracking is a key component of many modern dialog systems, ...
research
12/11/2017

Learning Robust Dialog Policies in Noisy Environments

Modern virtual personal assistants provide a convenient interface for co...
