On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?

04/17/2022
by Nouha Dziri, et al.

Knowledge-grounded conversational models are known to suffer from producing factually invalid statements, a phenomenon commonly called hallucination. In this work, we investigate the underlying causes of this phenomenon: is hallucination due to the training data, or to the models? We conduct a comprehensive human study on both existing knowledge-grounded conversational benchmarks and several state-of-the-art models. Our study reveals that the standard benchmarks consist of >60% hallucinated responses, leading to models that not only hallucinate but even amplify hallucinations. Our findings raise important questions on the quality of existing datasets and models trained using them. We make our annotations publicly available for future research.


Related research

04/15/2021 · Retrieval Augmentation Reduces Hallucination in Conversation
Despite showing increasingly human-like conversational abilities, state-...

07/22/2022 · Knowledge-Grounded Conversational Data Augmentation with Generative Conversational Networks
While rich, open-domain textual data are generally available and may inc...

10/24/2020 · An Evaluation Protocol for Generative Conversational Systems
There is a multitude of novel generative models for open-domain conversa...

02/24/2020 · Low-Resource Knowledge-Grounded Dialogue Generation
Responding with knowledge has been recognized as an important capability...

02/08/2023 · ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots
Conversational AI and Question-Answering systems (QASs) for knowledge gr...

12/28/2019 · All-in-One Image-Grounded Conversational Agents
As single-task accuracy on individual language and image tasks has impro...

05/19/2023 · MultiTurnCleanup: A Benchmark for Multi-Turn Spoken Conversational Transcript Cleanup
Current disfluency detection models focus on individual utterances each ...
