Towards Boosting the Open-Domain Chatbot with Human Feedback

08/30/2022
by Hua Lu, et al.

Many open-domain dialogue models pre-trained on social media comments can generate coherent replies but struggle to produce engaging responses when interacting with real users. This likely stems mainly from the scarcity of annotated human-human conversations and from misalignment with human preference. In this paper, we propose Diamante, a novel and efficient approach to boosting the open-domain chatbot, in which two kinds of human feedback (explicit demonstration and implicit preference) are collected and leveraged. By asking annotators to select or amend model-generated candidate responses, Diamante efficiently collects human-demonstrated responses and constructs a Chinese chit-chat dataset. To improve alignment with human preference, Diamante leverages the implicit preference revealed in the data collection process and introduces generation-evaluation joint training. Comprehensive experiments indicate that the Diamante dataset and the joint training paradigm can significantly boost the performance of Chinese pre-trained dialogue models.
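
The abstract does not give the joint training objective, so the following is only a minimal sketch of what a generation-evaluation joint loss might look like. It assumes the generation term is the average negative log-likelihood of the human-demonstrated response, and the evaluation term is a pairwise logistic ranking loss that prefers the human-demonstrated response over the model-generated candidates the annotator did not keep. The function names, the `weight` parameter, and the choice of a logistic ranking loss are all hypothetical and may differ from the paper's actual formulation.

```python
import math


def nll_loss(token_probs):
    """Average negative log-likelihood of the demonstrated response tokens.

    token_probs: probabilities the model assigns to each reference token.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def pairwise_preference_loss(preferred_score, rejected_score):
    """Logistic ranking loss: small when the preferred response scores higher."""
    return math.log(1.0 + math.exp(-(preferred_score - rejected_score)))


def joint_loss(token_probs, human_score, candidate_scores, weight=1.0):
    """Hypothetical joint objective: generation loss plus weighted preference loss.

    human_score: evaluation score of the human-demonstrated response (assumed).
    candidate_scores: evaluation scores of unselected model candidates (assumed).
    """
    generation = nll_loss(token_probs)
    if not candidate_scores:
        return generation
    preference = sum(
        pairwise_preference_loss(human_score, score) for score in candidate_scores
    ) / len(candidate_scores)
    return generation + weight * preference


if __name__ == "__main__":
    # The loss is lower when the human response outscores the other candidates.
    aligned = joint_loss([0.9, 0.8, 0.7], human_score=2.0, candidate_scores=[0.5, -1.0])
    misaligned = joint_loss([0.9, 0.8, 0.7], human_score=-1.0, candidate_scores=[0.5, 2.0])
    print(aligned < misaligned)
```

Under these assumptions, a single training signal rewards both fluent generation of the demonstrated response and an evaluation score that ranks it above the alternatives the annotator passed over.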


research
09/06/2023

Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses

Recently, utilizing deep neural networks to build the opendomain dialogu...
research
03/17/2022

EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

Large-scale pre-training has shown remarkable performance in building op...
research
01/20/2021

WeChat AI's Submission for DSTC9 Interactive Dialogue Evaluation Track

We participate in the DSTC9 Interactive Dialogue Evaluation Track (Gunas...
research
02/01/2020

Dialogue-based simulation for cultural awareness training

Existing simulations designed for cultural and interpersonal skill train...
research
09/15/2020

Dialogue Response Ranking Training with Large-Scale Human Feedback Data

Existing open-domain dialog models are generally trained to minimize the...
research
06/30/2023

Preference Ranking Optimization for Human Alignment

Large language models (LLMs) often contain misleading content, emphasizi...
research
11/03/2021

Automatic Evaluation and Moderation of Open-domain Dialogue Systems

The development of Open-Domain Dialogue Systems (ODS) is a trending topic...
