A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation

07/08/2023
by Neeraj Varshney, et al.

Recently developed large language models have achieved remarkable success in generating fluent and coherent text. However, these models often tend to 'hallucinate', which critically hampers their reliability. In this work, we address this crucial problem and propose an approach that actively detects and mitigates hallucinations during the generation process. Specifically, we first identify candidates of potential hallucination by leveraging the model's logit output values, check their correctness through a validation procedure, mitigate the detected hallucinations, and then continue with the generation process. Through extensive experiments on the 'article generation task', we first demonstrate the individual efficacy of our detection and mitigation techniques. Specifically, the detection technique achieves a recall of ~88% and the mitigation technique successfully mitigates 57.6% of the correctly detected hallucinations. Importantly, our mitigation technique does not introduce new hallucinations even in the case of incorrectly detected hallucinations, i.e., false positives. Then, we show that the proposed active detection and mitigation approach successfully reduces the hallucinations of the GPT-3 model from 47.5% to 14.5% on average. In summary, our work contributes to improving the reliability and trustworthiness of large language models, a crucial step en route to enabling their widespread adoption in real-world applications.
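To make the described pipeline concrete, below is a minimal Python sketch of the detect-validate-mitigate loop. It assumes the model exposes per-token log-probabilities; the helper names (flag_low_confidence, validate, repair), the sentence-level processing, and the 0.5 probability cutoff are illustrative assumptions for this sketch, not the authors' implementation.

"""
Minimal sketch of the active detect-and-mitigate loop described in the
abstract: take a generated sentence, flag low-confidence tokens from the
model's logit (log-probability) output, validate the flagged spans, repair
the sentence if validation fails, then append it and continue generating.
All helper functions below are hypothetical stand-ins, not the authors' code.
"""

import math
from typing import List, Tuple

Token = Tuple[str, float]  # (token text, log-probability assigned by the model)

PROB_THRESHOLD = 0.5  # assumed probability cutoff for flagging a token


def flag_low_confidence(tokens: List[Token]) -> List[str]:
    """Return tokens whose generation probability falls below the cutoff."""
    return [text for text, logprob in tokens if math.exp(logprob) < PROB_THRESHOLD]


def validate(sentence: str, flagged: List[str]) -> bool:
    """Hypothetical validation step, e.g. checking flagged concepts against
    retrieved evidence. Here it is a stub that treats everything as correct."""
    return True  # replace with retrieval plus fact checking


def repair(sentence: str, flagged: List[str]) -> str:
    """Hypothetical mitigation step, e.g. re-prompting the model with the
    retrieved evidence to rewrite the unsupported span."""
    return sentence  # replace with an instructed rewrite


def active_generation(sentences_with_logprobs: List[Tuple[str, List[Token]]]) -> str:
    """Process model output sentence by sentence, fixing hallucinations as
    they appear rather than post hoc, so errors do not propagate."""
    article = []
    for sentence, tokens in sentences_with_logprobs:
        flagged = flag_low_confidence(tokens)
        if flagged and not validate(sentence, flagged):
            sentence = repair(sentence, flagged)
        article.append(sentence)
    return " ".join(article)


if __name__ == "__main__":
    demo = [
        ("The Eiffel Tower is in Paris.",
         [("Eiffel", math.log(0.92)), ("Paris", math.log(0.88))]),
        ("It was completed in 1889.",
         [("1889", math.log(0.35))]),  # low confidence: candidate hallucination
    ]
    print(active_generation(demo))

In practice, the validation and repair stubs would be backed by a retrieval step (e.g., web or knowledge-base search) and a follow-up prompt to the same model, so that only the flagged low-confidence span is checked and rewritten before generation continues.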


research · 05/25/2023 · Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation
Large language models (large LMs) are susceptible to producing text with...

research · 09/03/2023 · Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
While large language models (LLMs) have demonstrated remarkable capabili...

research · 07/01/2023 · Understanding Counterspeech for Online Harm Mitigation
Counterspeech offers direct rebuttals to hateful speech by challenging p...

research · 09/06/2023 · Zero-Resource Hallucination Prevention for Large Language Models
The prevalent use of large language models (LLMs) in various domains has...

research · 09/20/2023 · Exploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and Concreteness
As Large Language Models (LLMs) have advanced, they have brought forth n...

research · 07/30/2023 · A Private Watermark for Large Language Models
Recently, text watermarking algorithms for large language models (LLMs) ...

research · 07/19/2023 · Code Detection for Hardware Acceleration Using Large Language Models
Large language models (LLMs) have been massively applied to many tasks, ...
