Building Task-Oriented Visual Dialog Systems Through Alternative Optimization Between Dialog Policy and Language Generation

09/06/2019
by Mingyang Zhou, et al.

Reinforcement learning (RL) is an effective approach for learning an optimal dialog policy for task-oriented visual dialog systems. A common practice is to apply RL to a neural sequence-to-sequence (seq2seq) framework, with the decoder's output vocabulary as the action space. However, it is difficult to design a reward function that balances learning an effective policy against generating natural dialog responses. This paper proposes a novel framework that alternately trains an RL policy for image guessing and a supervised seq2seq model to improve dialog generation quality. We evaluate the framework on the GuessWhich task, where it achieves state-of-the-art performance in both task completion and dialog quality.
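The alternating training idea can be illustrated with a minimal sketch. The abstract does not state the schedule, the phase lengths, or which parameters are frozen in each phase, so everything below is an assumption for illustration: the names `alternating_schedule`, `train_alternately`, `rl_step`, and `sl_step` are hypothetical, and the two step callbacks stand in for an RL update of the image-guessing policy and a supervised update of the seq2seq generator.

```python
from typing import Callable, List


def alternating_schedule(num_epochs: int, rl_epochs: int = 1, sl_epochs: int = 1) -> List[str]:
    """Return the phase ("rl" or "sl") assigned to each epoch.

    Phase lengths are assumed hyperparameters, not values from the paper.
    """
    if rl_epochs < 1 or sl_epochs < 1:
        raise ValueError("each phase must last at least one epoch")
    cycle = ["rl"] * rl_epochs + ["sl"] * sl_epochs
    return [cycle[i % len(cycle)] for i in range(num_epochs)]


def train_alternately(
    num_epochs: int,
    rl_step: Callable[[], None],
    sl_step: Callable[[], None],
    rl_epochs: int = 1,
    sl_epochs: int = 1,
) -> List[str]:
    """Run one callback per epoch, switching between the two objectives."""
    phases = alternating_schedule(num_epochs, rl_epochs, sl_epochs)
    for phase in phases:
        if phase == "rl":
            rl_step()  # hypothetical: RL update of the image-guessing policy
        else:
            sl_step()  # hypothetical: supervised update of the seq2seq generator
    return phases
```

Keeping the two updates in separate phases means neither objective has to be folded into a single reward function, which is the difficulty the abstract points to.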

research
08/28/2019

Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog

Dialog policy decides what and how a task-oriented dialog system will re...
research
05/08/2018

Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog

Creating an intelligent conversational system that understands vision an...
research
04/16/2020

Paraphrase Augmented Task-Oriented Dialog Generation

Neural generative models have achieved promising performance on dialog g...
research
07/05/2019

Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes

Open domain dialog systems face the challenge of being repetitive and pr...
research
12/07/2017

End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient

Learning a goal-oriented dialog policy is generally performed offline wi...
research
04/07/2019

Unsupervised Dialog Structure Learning

Learning a shared dialog structure from a set of task-oriented dialogs i...
research
05/04/2023

An Asynchronous Updating Reinforcement Learning Framework for Task-oriented Dialog System

Reinforcement learning has been applied to train the dialog systems in m...
