SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

03/15/2023
by   Potsawee Manakul, et al.

Generative Large Language Models (LLMs) such as GPT-3 are capable of generating highly fluent responses to a wide variety of user prompts. However, LLMs are known to hallucinate facts and make non-factual statements, which can undermine trust in their output. Existing fact-checking approaches require either access to the token-level output probability distribution (which may not be available for systems such as ChatGPT) or external databases that are interfaced via separate, often complex, modules. In this work, we propose "SelfCheckGPT", a simple sampling-based approach that can be used to fact-check black-box models in a zero-resource fashion, i.e., without an external database. SelfCheckGPT leverages the simple idea that if an LLM has knowledge of a given concept, sampled responses are likely to be similar and contain consistent facts. For hallucinated facts, however, stochastically sampled responses are likely to diverge and contradict one another. We investigate this approach by using GPT-3 to generate passages about individuals from the WikiBio dataset, and manually annotate the factuality of the generated passages. We demonstrate that SelfCheckGPT can: i) detect non-factual and factual sentences; and ii) rank passages in terms of factuality. We compare our approach to several existing baselines and show that in sentence hallucination detection, our approach has AUC-PR scores comparable to grey-box methods, while SelfCheckGPT is best at passage factuality assessment.
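The sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's scoring method: the word-overlap measure below is a deliberately crude stand-in chosen only to keep the example self-contained, and the function names (`sentence_inconsistency`, `passage_inconsistency`) are hypothetical. In practice the `samples` would be additional responses drawn stochastically from the same LLM for the same prompt; here they are hard-coded strings.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(passage):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def sentence_inconsistency(sentence, samples):
    """Score in [0, 1]; higher means the sampled responses support the sentence less.

    Assumption: support is measured as the fraction of the sentence's words
    that also appear in each sample, averaged over all samples.
    """
    words = tokenize(sentence)
    if not words or not samples:
        return 1.0  # nothing to compare against: treat as unsupported
    support = [len(words & tokenize(sample)) / len(words) for sample in samples]
    return 1.0 - sum(support) / len(support)


def passage_inconsistency(passage, samples):
    """Average the sentence-level scores to get one score for the whole passage."""
    scores = [sentence_inconsistency(s, samples) for s in split_sentences(passage)]
    return sum(scores) / len(scores) if scores else 1.0


# Hard-coded stand-ins for stochastically sampled LLM responses.
samples = [
    "Alan Turing was born in London in 1912.",
    "Turing was born in 1912 in London.",
]
supported = "Turing was born in London."
unsupported = "He won an Olympic medal."

print(sentence_inconsistency(supported, samples))
print(sentence_inconsistency(unsupported, samples))
print(passage_inconsistency(supported + " " + unsupported, samples))
```

A sentence that every sample agrees with receives a low score, while a sentence that no sample mentions receives a high one. Averaging over sentences gives a passage-level score that could be used to rank passages. A lexical overlap measure like this cannot detect contradictions phrased with the same words, so it should be read only as an illustration of the consistency principle.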


research
02/24/2023

Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback

Large language models (LLMs), such as ChatGPT, are able to generate huma...
research
05/30/2023

Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models

Large language models (LLMs) specializing in natural language generation...
research
06/14/2023

Knowledge Distillation of Large Language Models

Knowledge Distillation (KD) is a promising technique for reducing the hi...
research
09/07/2023

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Despite their impressive capabilities, large language models (LLMs) are ...
research
07/13/2023

Generating Benchmarks for Factuality Evaluation of Language Models

Before deploying a language model (LM) within a given domain, it is impo...
research
05/25/2023

Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation

Large language models (large LMs) are susceptible to producing text with...
research
06/06/2023

I'm Afraid I Can't Do That: Predicting Prompt Refusal in Black-Box Generative Language Models

Since the release of OpenAI's ChatGPT, generative language models have a...
