A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems

10/17/2022
by Hong Liu, et al.

Building user simulators (USs) for reinforcement learning (RL) of task-oriented dialog systems (DSs) has attracted growing attention, but it still faces several fundamental challenges. First, it is unclear whether pretrained language models can be leveraged to design USs, for example GPT-2 based USs, that keep pace with and interact with recently advanced GPT-2 based DSs. Second, an important ingredient of a US is that the user goal is effectively incorporated and tracked; how to flexibly integrate goal state tracking and develop an end-to-end trainable US for multiple domains remains a challenge. In this work, we propose a generative user simulator (GUS) with a GPT-2 based architecture and goal state tracking to address these two challenges. Extensive experiments are conducted on MultiWOZ2.1. Different DSs are trained via RL with GUS, with the classic agenda-based user simulator (ABUS), and with other ablation simulators, and are compared in cross-model evaluation, corpus-based evaluation, and human evaluation. GUS achieves superior results in all three evaluation tasks.
