PACE: Improving Prompt with Actor-Critic Editing for Large Language Model

08/19/2023
by   Yihong Dong, et al.

Large language models (LLMs) have shown remarkable potential across a wide range of tasks when conditioned on prompts. However, human-written prompts vary in quality, and this variation leads to substantial differences in LLM performance; improving a prompt usually requires considerable human effort and expertise. To address this, this paper proposes Prompt with Actor-Critic Editing (PACE), which enables LLMs to edit prompts automatically. Drawing inspiration from the actor-critic algorithm in reinforcement learning, PACE uses LLMs in the dual roles of actor and critic and treats the prompt as a type of policy. PACE refines a prompt using feedback from both roles: actors execute the prompt to produce responses, and critics criticize those responses. Because the process is grounded in real LLM responses and reasoning, it helps align the prompt with a specific task. We conduct extensive experiments on 24 instruction induction tasks and 21 BIG-bench tasks. The results indicate that PACE improves the relative performance of medium- and low-quality human-written prompts by up to 98%, bringing them to a level comparable to high-quality human-written prompts. PACE also shows notable efficacy for prompt generation.
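The abstract describes the editing loop only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It shows one plausible reading: an actor executes the prompt, a critic comments on each response, and an editor rewrites the prompt from the critiques. The names `pace_edit`, `accuracy`, and the `toy_*` stand-ins, the text templates, the exact-match score, and the rule of keeping a candidate only when it scores higher are all hypothetical; in practice the three callables would wrap LLM calls.

```python
from typing import Callable, List, Tuple

LLM = Callable[[str], str]      # hypothetical interface: text in, text out
Example = Tuple[str, str]       # (input, expected output)


def accuracy(prompt: str, examples: List[Example], actor: LLM) -> float:
    """Fraction of examples the actor answers exactly (an assumed metric)."""
    hits = sum(actor(f"{prompt}\nInput: {x}") == y for x, y in examples)
    return hits / len(examples)


def pace_edit(prompt: str, examples: List[Example], actor: LLM,
              critic: LLM, editor: LLM, n_rounds: int = 3) -> str:
    """Refine a prompt using actor responses and critic feedback."""
    best, best_score = prompt, accuracy(prompt, examples, actor)
    for _ in range(n_rounds):
        critiques = []
        for x, y in examples:
            # Actor: execute the current prompt (the "policy") on an input.
            response = actor(f"{best}\nInput: {x}")
            # Critic: comment on the response given the expected output.
            critiques.append(critic(
                f"Prompt: {best}\nInput: {x}\n"
                f"Response: {response}\nExpected: {y}"))
        # Editor: rewrite the prompt in light of the collected critiques.
        candidate = editor(f"Prompt: {best}\nCritiques:\n" + "\n".join(critiques))
        cand_score = accuracy(candidate, examples, actor)
        if cand_score > best_score:  # assumption: keep strict improvements only
            best, best_score = candidate, cand_score
    return best


# Deterministic toy stand-ins so the sketch runs without any LLM.
def toy_actor(text: str) -> str:
    prompt, _, x = text.rpartition("\nInput: ")
    return x.upper() if "uppercase" in prompt else x


def toy_critic(text: str) -> str:
    lines = text.splitlines()
    response = lines[-2].split(": ", 1)[1]
    expected = lines[-1].split(": ", 1)[1]
    return "Correct." if response == expected else "The response should be in uppercase."


def toy_editor(text: str) -> str:
    head, _, critiques = text.partition("Critiques:\n")
    prompt = head.strip().split(": ", 1)[1]
    return prompt + " Answer in uppercase." if "uppercase" in critiques else prompt


examples = [("cat", "CAT"), ("dog", "DOG")]
improved = pace_edit("Repeat the word.", examples, toy_actor, toy_critic, toy_editor)
print(improved)
```

With the toy stand-ins, the first round's critiques lead the editor to append an uppercase instruction, after which the actor answers both examples correctly and later rounds leave the prompt unchanged.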


