Out-of-Distribution Detection and Selective Generation for Conditional Language Models

09/30/2022
by Jie Ren, et al.

Machine learning algorithms typically assume that training and test samples are independent and identically distributed. Much work has shown that high-performing ML classifiers can degrade significantly, producing overly confident yet wrong predictions, particularly on out-of-distribution (OOD) inputs. Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence, and may suffer even worse degradation on OOD inputs because prediction proceeds auto-regressively over many steps. Moreover, since arbitrary text can be generated, the space of potential low-quality outputs is larger, and it is important to know when to trust the generated output. We present a highly accurate and lightweight OOD detection method for CLMs and demonstrate its effectiveness on abstractive summarization and translation. We also show how our method can be used, under the common and realistic setting of distribution shift, for selective generation (analogous to selective prediction for classification): producing high-quality outputs while automatically abstaining from low-quality ones, enabling safer deployment of generative language models.
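As a rough illustration of the selective-generation idea described above (not the paper's exact method), one lightweight OOD score is the Mahalanobis distance from an input's embedding to a Gaussian fit on in-distribution embeddings; generation is attempted only when the score falls below a threshold, and the model abstains otherwise. Everything here is an illustrative assumption: the function names (`fit_gaussian`, `selective_generate`), the choice of plain Mahalanobis distance, and the threshold are placeholders, not the authors' implementation.

```python
import numpy as np

def fit_gaussian(train_embeddings):
    """Fit mean and inverse covariance of in-distribution embeddings."""
    mu = train_embeddings.mean(axis=0)
    cov = np.cov(train_embeddings, rowvar=False)
    # Regularize before inverting, for numerical stability.
    cov += 1e-6 * np.eye(cov.shape[0])
    return mu, np.linalg.inv(cov)

def mahalanobis_score(x, mu, cov_inv):
    """Squared Mahalanobis distance: larger means more OOD-like."""
    d = x - mu
    return float(d @ cov_inv @ d)

def selective_generate(x_embedding, generate_fn, mu, cov_inv, threshold):
    """Generate only if the input looks in-distribution; otherwise abstain."""
    score = mahalanobis_score(x_embedding, mu, cov_inv)
    if score > threshold:
        return None  # abstain on a suspected OOD input
    return generate_fn(x_embedding)
```

In practice the threshold would be tuned on held-out data to trade off coverage (how often the model answers) against output quality, which is the core trade-off in selective prediction.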


Related research

08/08/2023 · Learning Evaluation Models from Large Language Models for Sequence Generation
  Large language models achieve state-of-the-art performance on sequence g...

09/17/2022 · Selective Token Generation for Few-shot Natural Language Generation
  Natural language modeling with limited training data is a challenging pr...

11/03/2020 · Generating Unobserved Alternatives
  We consider problems where multiple predictions can be considered correc...

01/23/2018 · MaskGAN: Better Text Generation via Filling in the ______
  Neural text generation models are often autoregressive language models o...

03/08/2023 · Automatically Auditing Large Language Models via Discrete Optimization
  Auditing large language models for unexpected behaviors is critical to p...

11/05/2020 · Detecting Hallucinated Content in Conditional Neural Sequence Generation
  Neural sequence models can generate highly fluent sentences but recent s...

12/09/2021 · Spinning Language Models for Propaganda-As-A-Service
  We investigate a new threat to neural sequence-to-sequence (seq2seq) mod...
