SUMBT+LaRL: End-to-end Neural Task-oriented Dialog System with Reinforcement Learning

09/22/2020 ∙ by Hwaran Lee, et al.

Recent neural approaches have greatly improved the individual dialog components of task-oriented dialog systems, yet optimizing overall system performance remains a challenge. In this paper, we propose an end-to-end trainable neural dialog system with reinforcement learning, named SUMBT+LaRL. SUMBT+ estimates user acts as well as dialog belief states, and LaRL models latent system action spaces and generates responses given the estimated contexts. We experimentally demonstrate that a training framework in which SUMBT+ and LaRL are pretrained separately and the entire system is then fine-tuned significantly increases dialog success rates. We also propose new success criteria for applying reinforcement learning to the end-to-end dialog system, and we analyze how the results differ depending on the success criteria and the evaluation method. Consequently, our model achieved a new state-of-the-art success rate of 85.4% on corpus-based evaluation and a comparable success rate of 81.40% on simulator-based evaluation provided by the DSTC8 challenge.


