Conformal Nucleus Sampling

05/04/2023
by Shauli Ravfogel, et al.

Language models generate text by successively sampling the next word. A decoding procedure based on nucleus (top-p) sampling chooses from the smallest possible set of words whose cumulative probability exceeds the threshold p. In this work, we assess whether a top-p set is indeed aligned with its probabilistic meaning in various linguistic contexts. We employ conformal prediction, a calibration procedure that constructs minimal prediction sets according to a desired confidence level, to calibrate the parameter p as a function of the entropy of the next-word distribution. We find that OPT models are overconfident, and that calibration shows a moderate inverse scaling with model size.
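To make the two ingredients concrete, the sketch below builds top-p sets and calibrates p separately within entropy bins using split conformal prediction. This is a minimal illustration under stated assumptions, not the authors' released code: the nonconformity score (the smallest p whose nucleus covers the observed token), the equal-mass entropy bins, and all function names are choices made here for exposition.

```python
import numpy as np

def top_p_set(probs: np.ndarray, p: float) -> np.ndarray:
    """Indices of the smallest set of tokens whose cumulative
    probability exceeds p (the nucleus, or top-p, set)."""
    order = np.argsort(probs)[::-1]            # most probable tokens first
    cum = np.cumsum(probs[order])
    k = int(np.searchsorted(cum, p, side="right")) + 1
    return order[: min(k, len(order))]

def coverage_score(probs: np.ndarray, true_idx: int) -> float:
    """Nonconformity score (an assumption of this sketch): the minimal
    p whose top-p set contains the observed next token."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    rank = int(np.nonzero(order == true_idx)[0][0])
    return float(cum[rank])

def calibrate_p_by_entropy(cal_dists, cal_tokens, alpha=0.1, n_bins=5):
    """Split-conformal calibration of p within entropy bins: in each bin,
    choose the smallest p whose top-p sets cover the true next token on
    at least a (1 - alpha) fraction of held-out examples, using the
    standard (n + 1) finite-sample correction."""
    scores = np.array([coverage_score(q, t)
                       for q, t in zip(cal_dists, cal_tokens)])
    ents = np.array([-np.sum(q * np.log(q + 1e-12)) for q in cal_dists])
    edges = np.quantile(ents, np.linspace(0.0, 1.0, n_bins + 1))
    p_per_bin = np.ones(n_bins)                # fallback: full vocabulary
    for b in range(n_bins):
        in_bin = (ents >= edges[b]) & (ents <= edges[b + 1])
        s = scores[in_bin]
        n = len(s)
        if n == 0:
            continue
        level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        p_per_bin[b] = np.quantile(s, level, method="higher")
    return edges, p_per_bin
```

At decoding time one would compute the entropy of the current next-word distribution, look up its bin (e.g. np.searchsorted(edges, H) clipped to the valid range), and sample from the corresponding top-p set; the usual split-conformal guarantee then gives each bin's sets coverage of at least 1 - alpha on exchangeable data.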

Related research

06/11/2019  Calibration, Entropy Rates, and Memory in Language Models
Building accurate language models that capture meaningful long-term depe...

11/20/2016  Visualizing Linguistic Shift
Neural network based models are a very powerful tool for creating word e...

11/21/2022  AdaFocal: Calibration-aware Adaptive Focal Loss
Much recent work has been devoted to the problem of ensuring that a neur...

02/01/2022  Typical Decoding for Natural Language Generation
Despite achieving incredibly low perplexities on myriad natural language...

10/27/2022  Truncation Sampling as Language Model Desmoothing
Long samples of text from neural language models can be of poor quality....

07/07/2023  On the Efficacy of Sampling Adapters
Sampling is a common strategy for generating text from probabilistic mod...

07/19/2023  Non-parametric inference on calibration of predicted risks
Moderate calibration, the expected event probability among observations ...
