An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

03/21/2022
by Taylor Sorensen, et al.

Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.
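The selection rule described above can be illustrated with a minimal sketch. It assumes each candidate template has already been run over a set of unlabeled inputs and that the model's output probabilities over the answer options were collected into an array of shape (inputs, options). The estimator used here, the entropy of the averaged output distribution minus the average per-input entropy, is one standard way to estimate mutual information from such outputs; it is an assumption of this sketch, not a detail given in the abstract. The names `entropy`, `mutual_information`, and `select_template` are hypothetical.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    log_p = np.log(np.where(p > 0, p, 1.0))  # log(1) = 0 where p == 0
    return -(p * log_p).sum(axis=axis)


def mutual_information(probs):
    """Estimate I(X; Y) = H(Y) - H(Y | X) from model outputs.

    probs: array of shape (n_inputs, n_options); row i holds the
    model's probabilities over answer options for input i.
    No ground truth labels are used.
    """
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum(axis=1, keepdims=True)  # renormalize rows
    marginal = probs.mean(axis=0)                     # estimate of p(y)
    h_y = entropy(marginal)                           # H(Y)
    h_y_given_x = entropy(probs, axis=1).mean()       # H(Y | X)
    return h_y - h_y_given_x


def select_template(outputs_by_template):
    """Return the template name with the highest estimated MI, plus all scores."""
    scores = {
        name: mutual_information(probs)
        for name, probs in outputs_by_template.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Under this estimator, a template scores highly when the model is confident on each individual input yet its answers vary across inputs. A template that always yields the same answer, or always yields a flat distribution, scores near zero.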


research
02/24/2018

Water from Two Rocks: Maximizing the Mutual Information

Our goal is to forecast ground truth Y using two sources of information ...
research
10/05/2020

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Large-scale language models such as BERT have achieved state-of-the-art ...
research
07/29/2019

Improved mutual information measure for classification and community detection

The information theoretic quantity known as mutual information finds wid...
research
09/09/2021

MetaXT: Meta Cross-Task Transfer between Disparate Label Spaces

Albeit the universal representational power of pre-trained language mode...
research
10/26/2020

Probing Task-Oriented Dialogue Representation from Language Models

This paper investigates pre-trained language models to find out which mo...
research
04/28/2022

Automatic Detection and Classification of Symbols in Engineering Drawings

A method of finding and classifying various components and objects in a ...
research
05/31/2019

Max-MIG: an Information Theoretic Approach for Joint Learning from Crowds

Eliciting labels from crowds is a potential way to obtain large labeled ...
