Retrieval Augmentation Reduces Hallucination in Conversation

by Kurt Shuster et al.

Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020). In this work we explore the use of neural-retrieval-in-the-loop architectures (recently shown to be effective in open-domain QA; Lewis et al., 2020b; Izacard and Grave, 2020) for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses. We study various types of architectures with multiple components (retrievers, rankers, and encoder-decoders) with the goal of maximizing knowledgeability while retaining conversational ability. We demonstrate that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks. The models exhibit open-domain conversational capabilities, generalize effectively to scenarios not covered by the training data, and, as verified by human evaluations, substantially reduce the well-known problem of knowledge hallucination in state-of-the-art chatbots.
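The retrieval-in-the-loop idea described in the abstract can be sketched in a few lines: retrieve and rank candidate passages against the multi-turn dialogue context, then condition the response generator on the top passages. The toy bag-of-words scorer and the `generate` stub below are illustrative stand-ins, not the paper's actual retriever or encoder-decoder.

```python
# Minimal sketch of a retrieval-in-the-loop dialogue pipeline.
# The scoring function and generate() stub are hypothetical placeholders
# for the neural retriever/ranker and encoder-decoder studied in the paper.

def score(query: str, doc: str) -> float:
    """Rank a document by word overlap with the dialogue context,
    lightly length-normalized so long documents are not favored."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / (len(d) ** 0.5 or 1.0)

def retrieve(context: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents for the (multi-turn) dialogue context."""
    return sorted(docs, key=lambda d: score(context, d), reverse=True)[:k]

def generate(context: str, passages: list[str]) -> str:
    """Stand-in for an encoder-decoder conditioned on retrieved knowledge."""
    return f"Response grounded on: {passages[0]}"

docs = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Basketball was invented by James Naismith in 1891.",
]
context = "User: When was the Eiffel Tower completed?"
print(generate(context, retrieve(context, docs)))
```

In the actual architectures, `score` would be a learned dense retriever over a large document collection (optionally followed by a separate reranker), and `generate` a seq2seq model that attends over the retrieved passages together with the dialogue history.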


Related papers:

- Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion
- Multi-Modal Open-Domain Dialogue
- Engaging Image Chat: Modeling Personality in Grounded Dialogue
- A Dataset for Sentence Retrieval for Open-Ended Dialogues
- On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?
- The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
- Reason first, then respond: Modular Generation for Knowledge-infused Dialogue