"Think Before You Speak": Improving Multi-Action Dialog Policy by Planning Single-Action Dialogs

04/25/2022
by Shuo Zhang, et al.

Multi-action dialog policy (MADP), which generates multiple atomic dialog actions per turn, has been widely applied in task-oriented dialog systems to provide expressive and efficient system responses. Existing MADP models usually imitate action combinations from the labeled multi-action dialog samples. Due to data limitations, they generalize poorly toward unseen dialog flows. While interactive learning and reinforcement learning algorithms can be applied to incorporate external data sources of real users and user simulators, they take significant manual effort to build and suffer from instability. To address these issues, we propose Planning Enhanced Dialog Policy (PEDP), a novel multi-task learning framework that learns single-action dialog dynamics to enhance multi-action prediction. Our PEDP method employs model-based planning to conceive what to express before deciding the current response, by simulating single-action dialogs. Experimental results on the MultiWOZ dataset demonstrate that our fully supervised learning-based method achieves a solid task success rate of 90.6%.
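The idea of planning a single-action dialog before committing to a multi-action response can be illustrated with a toy sketch. This is not the PEDP implementation: the state representation, `single_action_policy`, `world_model`, `plan_multi_action`, and the action names are all hypothetical stand-ins for the learned components the abstract describes, and the merging rule (collect the simulated actions in order) is an assumption made for illustration only.

```python
from typing import FrozenSet, List

# Hypothetical dialog state: the set of slots the system still has to address.
State = FrozenSet[str]


def single_action_policy(state: State) -> str:
    """Toy stand-in for a learned single-action policy: pick one atomic action."""
    if not state:
        return "general-bye"
    # Deterministic choice (alphabetical) so the example is reproducible.
    return "inform-" + sorted(state)[0]


def world_model(state: State, action: str) -> State:
    """Toy stand-in for learned dialog dynamics: predict the next state."""
    if action.startswith("inform-"):
        return state - {action[len("inform-"):]}
    return state


def plan_multi_action(state: State, horizon: int = 3) -> List[str]:
    """Simulate a short single-action dialog and merge its actions into one turn."""
    planned: List[str] = []
    for _ in range(horizon):
        action = single_action_policy(state)
        if action in planned:
            # The simulated dialog has stopped producing new actions.
            break
        planned.append(action)
        state = world_model(state, action)
    return planned


if __name__ == "__main__":
    print(plan_multi_action(frozenset({"price", "area"})))
```

With the two pending slots above, the simulated rollout yields `['inform-area', 'inform-price', 'general-bye']`, which the system would express as a single multi-action turn. In a real system both the policy and the dynamics model would be learned from data rather than hand-written rules.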

