Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration

12/31/2020
by   Weiyan Shi, et al.

Despite the recent success of large-scale language models on various downstream NLP tasks, the repetition and inconsistency problems still persist in dialogue response generation. Previous approaches have attempted to avoid repetition by penalizing the language model's undesirable behaviors in the loss function. However, these methods focus on token-level information and can lead to incoherent responses and uninterpretable behaviors. To alleviate these issues, we propose to apply reinforcement learning to refine an MLE-based language model without user simulators, and distill sentence-level information about repetition, inconsistency and task relevance through rewards. In addition, to better accomplish the dialogue task, the model learns from human demonstration to imitate intellectual activities such as persuasion, and selects the most persuasive responses. Experiments show that our model outperforms previous state-of-the-art dialogue models on both automatic metrics and human evaluation results on a donation persuasion task, and generates more diverse, consistent and persuasive conversations according to the user feedback.

research
06/16/2023

Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System

Dialogue systems and large language models (LLMs) have gained considerab...
research
04/18/2022

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

Conventionally, generation of natural language for dialogue agents may b...
research
09/12/2023

Leveraging Large Language Models for Automated Dialogue Analysis

Developing high-performing dialogue systems benefits from the automatic ...
research
05/18/2023

SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation

Language models trained on large-scale corpora can generate remarkably f...
research
03/02/2020

Learning from Easy to Complex: Adaptive Multi-curricula Learning for Neural Dialogue Generation

Current state-of-the-art neural dialogue systems are mainly data-driven ...
research
10/08/2021

CheerBots: Chatbots toward Empathy and Emotion using Reinforcement Learning

Apart from the coherence and fluency of responses, an empathetic chatbot...
research
07/12/2021

Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue

To encourage AI agents to conduct meaningful Visual Dialogue (VD), the u...
