SUMBT+LaRL: End-to-end Neural Task-oriented Dialog System with Reinforcement Learning

09/22/2020
by Hwaran Lee, et al.

The recent advent of neural approaches has greatly improved the individual dialog components of task-oriented dialog systems, yet optimizing overall system performance remains a challenge. In this paper, we propose an end-to-end trainable neural dialog system with reinforcement learning, named SUMBT+LaRL. SUMBT+ estimates user acts as well as dialog belief states, and LaRL models latent system action spaces and generates responses given the estimated contexts. We experimentally demonstrate that a training framework in which SUMBT+ and LaRL are separately pretrained and the entire system is then fine-tuned significantly increases dialog success rates. We propose new success criteria for applying reinforcement learning to the end-to-end dialog system, and we provide an experimental analysis of how results differ depending on the success criteria and evaluation methods. Consequently, our model achieved a new state-of-the-art success rate of 85.4%, as well as a comparable success rate of 81.40% on the DSTC8 challenge.
