Large Language Models Are Human-Level Prompt Engineers

11/03/2022
by   Yongchao Zhou, et al.

By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model, and most effective prompts have been handcrafted by humans. Inspired by classical program synthesis and the human approach to prompt engineering, we propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection. In our method, we treat the instruction as the "program," optimized by searching over a pool of instruction candidates proposed by an LLM in order to maximize a chosen score function. To evaluate the quality of the selected instruction, we evaluate the zero-shot performance of another LLM following the selected instruction. Experiments on 24 NLP tasks show that our automatically generated instructions outperform the prior LLM baseline by a large margin and achieve better or comparable performance to the instructions generated by human annotators on 19/24 tasks. We conduct extensive qualitative and quantitative analyses to explore the performance of APE. We show that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness, as well as to improve few-shot learning performance by simply prepending them to standard in-context learning prompts. Please check out our webpage at https://sites.google.com/view/automatic-prompt-engineer.
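The propose-score-select loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The names `select_instruction`, `execution_accuracy`, and `toy_model` are hypothetical, the candidate pool is hard-coded instead of being proposed by an LLM, and exact-match accuracy on a few input-output pairs is assumed here as one possible score function.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (task input, expected output)


def execution_accuracy(
    instruction: str,
    examples: Sequence[Example],
    run_model: Callable[[str, str], str],
) -> float:
    """Fraction of examples the model answers exactly right under this instruction.

    Assumption: exact match is used as the score; any other score function
    with the same signature could be swapped in.
    """
    if not examples:
        raise ValueError("need at least one scoring example")
    hits = sum(run_model(instruction, x).strip() == y for x, y in examples)
    return hits / len(examples)


def select_instruction(
    candidates: Sequence[str],
    score_fn: Callable[[str], float],
) -> Tuple[str, float]:
    """Return the highest-scoring candidate instruction and its score."""
    if not candidates:
        raise ValueError("need at least one candidate instruction")
    scored: List[Tuple[float, str]] = [(score_fn(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# --- Toy stand-ins (hypothetical; a real setup would call an LLM here) ---
ANTONYMS = {"hot": "cold", "up": "down", "big": "small"}


def toy_model(instruction: str, task_input: str) -> str:
    """Pretend model: follows the instruction only if it mentions antonyms."""
    if "antonym" in instruction.lower():
        return ANTONYMS.get(task_input, task_input)
    return task_input  # otherwise it just echoes the input


# In the method described above, this pool would be proposed by an LLM.
candidates = [
    "Repeat the word.",
    "Write the antonym of the word.",
    "Translate the word to French.",
]
examples = [("hot", "cold"), ("up", "down"), ("big", "small")]

best, best_score = select_instruction(
    candidates,
    lambda c: execution_accuracy(c, examples, toy_model),
)
print(best, best_score)
```

With a real model, `toy_model` would be replaced by a call that sends the instruction and the input to the target LLM, and the selected instruction would then be evaluated zero-shot on held-out data.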

Related research

Finetuned Language Models Are Zero-Shot Learners (09/03/2021)
This paper explores a simple method for improving the zero-shot learning...

AutoHint: Automatic Prompt Optimization with Hint Generation (07/13/2023)
This paper presents AutoHint, a novel framework for automatic prompt eng...

Robust Instruction Optimization for Large Language Models with Distribution Shifts (05/23/2023)
Large Language Models have demonstrated significant ability in accomplis...

Automatic Prompt Optimization with "Gradient Descent" and Beam Search (05/04/2023)
Large Language Models (LLMs) have shown impressive performance as genera...

GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models (03/14/2022)
Providing natural language instructions in prompts is a useful new parad...

Are Language Models Worse than Humans at Following Prompts? It's Complicated (01/17/2023)
Prompts have been the center of progress in advancing language models' z...

Automated Few-shot Classification with Instruction-Finetuned Language Models (05/21/2023)
A particularly successful class of approaches for few-shot learning comb...
