Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models

05/30/2023
by   Zhen Lin, et al.

Large language models (LLMs) specializing in natural language generation (NLG) have recently started exhibiting promising capabilities across a variety of domains. However, gauging the trustworthiness of responses generated by LLMs remains an open challenge, with limited research on uncertainty quantification for NLG. Furthermore, existing literature typically assumes white-box access to language models, which is becoming unrealistic either due to the closed-source nature of the latest LLMs or due to computational constraints. In this work, we investigate uncertainty quantification in NLG for black-box LLMs. We first differentiate two closely related notions: uncertainty, which depends only on the input, and confidence, which additionally depends on the generated response. We then propose and compare several confidence/uncertainty metrics, applying them to selective NLG, where unreliable results could either be ignored or yielded for further assessment. Our findings on several popular LLMs and datasets reveal that a simple yet effective metric for the average semantic dispersion can be a reliable predictor of the quality of LLM responses. This study can provide valuable insights for practitioners on uncertainty management when adopting LLMs. The code to replicate all our experiments is available at https://github.com/zlin7/UQ-NLG.
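To make the "average semantic dispersion" idea concrete, the sketch below illustrates a black-box uncertainty metric: sample several responses for the same prompt, measure how semantically spread out they are, and abstain when the spread is high. This is a hypothetical illustration, not the paper's implementation; in particular, a simple Jaccard token-overlap score stands in for a learned semantic-similarity model, and the function names (`average_semantic_dispersion`, `selective_generate`) and the 0.5 threshold are illustrative assumptions.

```python
# Hypothetical sketch of dispersion-based uncertainty for a black-box LLM.
# A Jaccard token-overlap score is used as a crude stand-in for a learned
# semantic-similarity model; all names and thresholds are illustrative.
from itertools import combinations


def jaccard_similarity(a: str, b: str) -> float:
    """Token-overlap similarity between two responses (proxy measure only)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def average_semantic_dispersion(responses: list[str]) -> float:
    """Uncertainty as the mean pairwise dissimilarity among sampled responses.

    This depends only on the input (via its sampled responses), matching the
    paper's notion of *uncertainty*, as opposed to *confidence*, which would
    additionally condition on one particular generated response.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - jaccard_similarity(a, b) for a, b in pairs) / len(pairs)


def selective_generate(responses: list[str], threshold: float = 0.5):
    """Selective NLG: yield an answer only when dispersion is low enough.

    Returns (answer, uncertainty); the answer is None (abstain) when the
    sampled responses disagree too much.
    """
    uncertainty = average_semantic_dispersion(responses)
    if uncertainty < threshold:
        return responses[0], uncertainty
    return None, uncertainty
```

When the sampled responses largely agree (e.g. paraphrases of the same answer), dispersion is low and the answer is yielded; when they diverge, the call abstains so the unreliable result can be routed for further assessment, as in the selective NLG setting described above.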

