Adaptive Dialog Policy Learning with Hindsight and User Modeling

05/07/2020
by Yan Cao, et al.

Reinforcement learning methods have been used to compute dialog policies from language-based interaction experiences. Efficiency is particularly important in dialog policy learning, because interacting with people is costly and low-quality conversations give users a very poor experience. To improve the efficiency of dialog policy learning, we develop LHUA (Learning with Hindsight, User modeling, and Adaptation), an algorithm that, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users. Simulation provides the dialog agent with more experience, and hindsight provides more (positive) reinforcements. Experimental results suggest that, in success rate and policy quality, LHUA outperforms competitive baselines from the literature, including its no-simulation, no-adaptation, and no-hindsight counterparts.
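
To make concrete the idea that hindsight yields more positive reinforcements, the following is a minimal sketch of generic hindsight goal relabeling, in the spirit of hindsight experience replay. It is not the LHUA algorithm itself, whose details are not given in this abstract. The `Transition` fields, the slot-value goal encoding, and the reward values in `reward_fn` are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

# Hypothetical encoding: a goal or dialog state is a tuple of (slot, value) pairs.
SlotValues = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class Transition:
    state: SlotValues       # slots filled before the agent acts
    action: str             # agent dialog act (illustrative names only)
    next_state: SlotValues  # slots filled after the user responds
    goal: SlotValues        # goal the episode was scored against
    reward: float
    done: bool


def reward_fn(next_state: SlotValues, goal: SlotValues, done: bool) -> float:
    """Assumed reward: +1.0 if the dialog ends with the goal satisfied,
    otherwise a small per-turn penalty."""
    if done and set(goal) <= set(next_state):
        return 1.0
    return -0.05


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode scored against the goal it actually
    achieved, so an unsuccessful dialog still yields a positive example."""
    if not episode:
        return []
    achieved = tuple(sorted(episode[-1].next_state))
    if not achieved:
        return []  # nothing was achieved, so there is nothing to relabel
    return [
        replace(t, goal=achieved, reward=reward_fn(t.next_state, achieved, t.done))
        for t in episode
    ]


# Usage: the user wanted a Thai restaurant in the north, but the dialog
# ended with the area set to "south", so the original episode fails.
goal = (("area", "north"), ("cuisine", "thai"))
s1 = (("cuisine", "thai"),)
s2 = (("area", "south"), ("cuisine", "thai"))
episode = [
    Transition((), "request_cuisine", s1, goal, reward_fn(s1, goal, False), False),
    Transition(s1, "request_area", s2, goal, reward_fn(s2, goal, True), True),
]
relabeled = relabel_with_hindsight(episode)
```

In this sketch the original episode ends with a penalty, while its relabeled copy ends with a positive reward for the goal that was in fact reached. Both versions could be stored in a replay buffer, which is one way an agent can obtain positive reinforcement from dialogs that failed their original goal.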


