
Counterfactual Off-Policy Training for Neural Response Generation

by Qingfu Zhu, et al.
Harbin Institute of Technology
The Regents of the University of California

Learning a neural response generation model on data synthesized under the adversarial training framework helps to explore more possible responses. However, most responses synthesized from scratch are of low quality due to the vast size of the response space. In this paper, we propose a counterfactual off-policy method that learns from better-synthesized data. It uses a real response to infer, through a structural causal model, an alternative response that was not actually taken. Learning on these counterfactual responses helps to explore the high-reward area of the response space. An empirical study on the DailyDialog dataset shows that our approach significantly outperforms the HRED model as well as conventional adversarial training approaches.
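The abstract does not spell out how the structural causal model infers the response "not taken," but a standard way to pose such counterfactuals over categorical choices is the Gumbel-Max SCM (Oberst and Sontag): infer the exogenous Gumbel noise consistent with the observed token, then replay that noise under a different policy's logits. The sketch below is purely illustrative and assumes this construction; the function names and the token-level framing are our own, not taken from the paper.

```python
import numpy as np

def posterior_gumbel_noise(logits, observed, rng):
    """Infer exogenous Gumbel noise consistent with having observed
    `observed` ~ argmax(logits + Gumbel), via top-down sampling."""
    logits = np.asarray(logits, dtype=float)
    total = np.logaddexp.reduce(logits)
    g_max = rng.gumbel() + total          # max Gumbel value, placed at `observed`
    gumbels = np.empty_like(logits)
    for k in range(len(logits)):
        if k == observed:
            gumbels[k] = g_max
        else:
            g = rng.gumbel() + logits[k]
            # Gumbel(logits[k]) truncated above at g_max
            gumbels[k] = -np.logaddexp(-g_max, -g)
    return gumbels - logits               # exogenous noise, one value per token

def counterfactual_choice(new_logits, noise):
    """The token the SCM would emit under `new_logits` with the same noise."""
    return int(np.argmax(np.asarray(new_logits) + noise))

rng = np.random.default_rng(0)
logits = np.log([0.1, 0.2, 0.7])   # factual policy over a toy 3-token vocabulary
observed = 2                        # the response token actually taken
noise = posterior_gumbel_noise(logits, observed, rng)
# Consistency check: replaying the factual logits reproduces the observation.
assert counterfactual_choice(logits, noise) == observed
```

Given the inferred noise, replaying it with a different policy's logits yields an alternative response grounded in the real one, which is the sense in which learning happens "off-policy" on counterfactual data rather than on responses synthesized from scratch.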



