Structured Hierarchical Dialogue Policy with Graph Neural Networks
Dialogue policy training for composite tasks, such as restaurant reservation in multiple places, is a practically important and challenging problem. Recently, hierarchical deep reinforcement learning (HDRL) methods have achieved good performance on composite tasks. However, in vanilla HDRL, both the top-level and low-level policies are represented by multi-layer perceptrons (MLPs) that take the concatenation of all observations from the environment as input for predicting actions. As a result, traditional HDRL approaches often suffer from low sampling efficiency and poor transferability. In this paper, we address these problems by exploiting the flexibility of graph neural networks (GNNs). A novel model, ComNet, is proposed to model the structure of a hierarchical agent. The performance of ComNet is tested on composite tasks of the PyDial benchmark. Experiments show that ComNet outperforms vanilla HDRL systems, with performance close to the upper bound. It not only achieves sample efficiency but is also more robust to noise, while maintaining transferability to other composite tasks.
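The contrast drawn in the abstract, between an MLP fed the concatenation of all observations and a GNN whose parameters are shared across nodes, can be illustrated with a small sketch. This is not ComNet itself, whose architecture is not described here. It is a generic mean-aggregation message-passing layer, and every name in it (`flat_mlp_input`, `graph_layer`, `FEATURE_DIM`, `HIDDEN_DIM`, the fully connected slot graph) is a hypothetical choice made for illustration. It shows only why a concatenated input ties the network to one task size while shared per-node weights do not.

```python
import random

FEATURE_DIM = 4   # assumed size of each node's observation vector
HIDDEN_DIM = 8    # assumed size of each node's hidden representation


def make_matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Matrix-vector product for list-based matrices."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def flat_mlp_input(node_features):
    """MLP-style input: concatenate every node's observation into one vector.

    The length depends on the number of nodes, so the first MLP layer
    cannot be reused when the task has a different number of nodes.
    """
    return [x for feats in node_features for x in feats]


def graph_layer(node_features, adjacency, w_self, w_neigh):
    """One message-passing step with weights shared by all nodes.

    Each node combines its own features with the mean of its neighbours'
    features, then applies a ReLU. The weight shapes do not depend on
    how many nodes the graph has.
    """
    outputs = []
    for i, feats in enumerate(node_features):
        neighbours = [node_features[j] for j in adjacency[i]]
        if neighbours:
            mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        else:
            mean = [0.0] * len(feats)
        pre = [a + b for a, b in zip(matvec(w_self, feats), matvec(w_neigh, mean))]
        outputs.append([max(0.0, v) for v in pre])
    return outputs


def fully_connected(n):
    """Adjacency lists for a graph where every node sees every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


rng = random.Random(0)
w_self = make_matrix(HIDDEN_DIM, FEATURE_DIM, rng)
w_neigh = make_matrix(HIDDEN_DIM, FEATURE_DIM, rng)

# Two hypothetical tasks with different numbers of nodes (e.g. slots).
small_task = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(3)]
large_task = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(5)]

# The concatenated input changes size with the task ...
small_flat = flat_mlp_input(small_task)
large_flat = flat_mlp_input(large_task)

# ... while the same shared weights apply to both graphs unchanged.
small_hidden = graph_layer(small_task, fully_connected(3), w_self, w_neigh)
large_hidden = graph_layer(large_task, fully_connected(5), w_self, w_neigh)
```

Here the concatenated inputs have lengths 12 and 20, so an MLP trained on the first task has no weights for the extra inputs of the second, whereas `w_self` and `w_neigh` are applied to both graphs as they are. Whether this kind of parameter sharing accounts for the transferability reported for ComNet is a question for the full text.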