Elastic Weight Removal for Faithful and Abstractive Dialogue Generation

03/30/2023
by Nico Daheim, et al.

Ideally, dialogue systems should generate responses that are faithful to the knowledge contained in relevant documents. However, many models instead generate hallucinated responses that contradict this knowledge or contain unverifiable information. To mitigate such undesirable behaviour, it has been proposed to fine-tune a 'negative expert' on negative examples and subtract its parameters from those of a pre-trained model. Intuitively, however, this does not take into account that some parameters are more responsible than others for causing hallucinations. Thus, we propose to weigh their individual importance via (an approximation of) the Fisher Information matrix, which measures the uncertainty of their estimate. We call this method Elastic Weight Removal (EWR). We evaluate our method, using different variants of Flan-T5 as a backbone language model, on multiple datasets for information-seeking dialogue generation and compare it with state-of-the-art techniques for faithfulness, such as CTRL, Quark, DExperts, and Noisy Channel reranking. Extensive automatic and human evaluation shows that EWR systematically increases faithfulness at minor costs in terms of other metrics. However, we notice that only discouraging hallucinations may increase extractiveness, i.e. shallow copy-pasting of document spans, which can be undesirable. Hence, as a second main contribution, we show that our method can be extended to simultaneously discourage hallucinations and extractive responses. We publicly release the code for reproducing EWR and all baselines.
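
To make the contrast between uniform subtraction and Fisher-weighted subtraction concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the update rule from the paper: the abstract does not give the formula, so the per-parameter weight `lam * F_neg / (F_0 + lam * F_neg)`, the scaling factor `lam`, and all function names are hypothetical. The diagonal Fisher approximation is assumed to be the mean of squared per-example gradients, which is one common choice.

```python
from typing import List


def diagonal_fisher(per_example_grads: List[List[float]]) -> List[float]:
    """Diagonal Fisher estimate: mean of squared per-example gradients.

    Assumption: this empirical diagonal form stands in for the
    approximation mentioned in the abstract.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    return [sum(g[j] ** 2 for g in per_example_grads) / n for j in range(dim)]


def plain_removal(theta_0: List[float], theta_neg: List[float],
                  lam: float = 1.0) -> List[float]:
    """Subtract the negative expert's change (theta_neg - theta_0) uniformly."""
    return [p - lam * (q - p) for p, q in zip(theta_0, theta_neg)]


def fisher_weighted_removal(theta_0: List[float], theta_neg: List[float],
                            fisher_0: List[float], fisher_neg: List[float],
                            lam: float = 1.0, eps: float = 1e-12) -> List[float]:
    """Subtract the negative expert's change with a per-parameter weight.

    Hypothetical weighting: parameters that matter more to the negative
    expert (high fisher_neg) relative to the pre-trained model (fisher_0)
    are moved further away from the negative expert.
    """
    result = []
    for p, q, f0, fn in zip(theta_0, theta_neg, fisher_0, fisher_neg):
        tau = q - p                                # negative expert's change
        weight = lam * fn / (f0 + lam * fn + eps)  # lies in [0, 1)
        result.append(p - weight * tau)
    return result


if __name__ == "__main__":
    theta_0 = [1.0, 1.0]        # toy pre-trained parameters
    theta_neg = [2.0, 2.0]      # toy negative-expert parameters
    fisher_0 = [1.0, 1.0]
    fisher_neg = [1.0, 0.0]     # second parameter is irrelevant to the expert
    print(plain_removal(theta_0, theta_neg))
    print(fisher_weighted_removal(theta_0, theta_neg, fisher_0, fisher_neg))
```

In this toy setting, uniform subtraction shifts both parameters by the same amount, whereas the weighted variant leaves the second parameter untouched because its Fisher value under the negative expert is zero.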


