FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

04/22/2022
by Nouha Dziri, et al.

The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources. However, dialogue systems often produce unsupported utterances, a phenomenon known as hallucination. Dziri et al. (2022)'s investigation of hallucinations has revealed that existing knowledge-grounded benchmarks are contaminated with hallucinated responses at an alarming level (>60% of the responses), and that models trained on this data amplify hallucinations even further (>80% of the responses). To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark. We observe that FaithDial is more faithful than WoW while also maintaining engaging conversations. We show that FaithDial can serve as a training signal for: i) a hallucination critic, which discriminates whether an utterance is faithful or not, and boosts performance by 21.1 F1 score on the BEGIN benchmark compared to existing datasets for dialogue coherence; ii) high-quality dialogue generation. We benchmark a series of state-of-the-art models and propose an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness based on several automated metrics. Further, we find that the benefits of FaithDial generalize to zero-shot transfer on other datasets, such as CMU-Dog and TopicalChat. Finally, human evaluation reveals that responses generated by models trained on FaithDial are perceived as more interpretable, cooperative, and engaging.
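The hallucination critic described above is, at its core, a binary classifier over (knowledge, response) pairs. The sketch below shows one plausible way to set up such a critic with an off-the-shelf encoder; the model name, the label convention, and the idea of fine-tuning on FaithDial's faithful/hallucinated labels are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a hallucination critic: a sequence-pair classifier that
# labels a response as faithful or hallucinated given the source knowledge.
# "roberta-base" and the label convention are assumptions; in practice the
# encoder would be fine-tuned on FaithDial's faithful vs. hallucinated pairs.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
critic = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
critic.eval()

def is_faithful(knowledge: str, response: str) -> bool:
    """Return True if the critic judges the response to be grounded in the knowledge."""
    inputs = tokenizer(knowledge, response, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = critic(**inputs).logits
    # Assumed label convention: 0 = faithful, 1 = hallucinated.
    return int(logits.argmax(dim=-1)) == 0

print(is_faithful(
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
    "It is a wrought-iron lattice tower located in Paris.",
))
```

Likewise, the auxiliary contrastive objective for generation contrasts faithful (FaithDial) responses against hallucinated alternatives. One common way to realize this idea is standard maximum likelihood on the faithful response combined with an unlikelihood penalty on a hallucinated negative; the sketch below follows that recipe with an assumed seq2seq model and loss weight, and is not necessarily the paper's exact formulation.

```python
# Sketch of a contrastive-style auxiliary loss: maximize the likelihood of the
# faithful response and push down token probabilities of a hallucinated
# negative (unlikelihood). "t5-small" and ALPHA are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-small"
ALPHA = 0.5  # weight of the unlikelihood term (assumption)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def contrastive_loss(context: str, faithful: str, hallucinated: str) -> torch.Tensor:
    enc = tokenizer(context, return_tensors="pt", truncation=True)
    pos = tokenizer(faithful, return_tensors="pt", truncation=True).input_ids
    neg = tokenizer(hallucinated, return_tensors="pt", truncation=True).input_ids

    # Standard cross-entropy on the faithful (grounded) response.
    mle_loss = model(**enc, labels=pos).loss

    # Unlikelihood on the hallucinated response: penalize its token probabilities.
    neg_logits = model(**enc, labels=neg).logits
    neg_log_probs = F.log_softmax(neg_logits, dim=-1)
    token_probs = neg_log_probs.gather(-1, neg.unsqueeze(-1)).squeeze(-1).exp()
    unlikelihood = -torch.log(1.0 - token_probs + 1e-8).mean()

    return mle_loss + ALPHA * unlikelihood

loss = contrastive_loss(
    "knowledge: The Louvre is in Paris. history: Where is the Louvre?",
    "It is located in Paris.",
    "It is located in London and is the oldest museum in the world.",
)
loss.backward()
```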

Related research

Evaluating Groundedness in Dialogue Systems: The BEGIN Benchmark (04/30/2021)
Knowledge-grounded dialogue agents are systems designed to conduct a con...

Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation (12/15/2021)
Knowledge-grounded dialogue systems are challenging to build due to the ...

Contrastive Learning Reduces Hallucination in Conversations (12/20/2022)
Pre-trained language models (LMs) store knowledge in their parameters an...

Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts (08/26/2021)
Dialogue models trained on human conversations inadvertently learn to ge...

Transferable Persona-Grounded Dialogues via Grounded Minimal Edits (09/16/2021)
Grounded dialogue models generate responses that are grounded on certain...

HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data (04/28/2022)
A pressing challenge in current dialogue systems is to successfully conv...

Diving Deep into Modes of Fact Hallucinations in Dialogue Systems (01/11/2023)
Knowledge Graph (KG) grounded conversations often use large pre-trained m...
