ZYN: Zero-Shot Reward Models with Yes-No Questions

08/11/2023
by Víctor Gallego, et al.

In this work, we address the problem of directing the text generations of an LLM towards a desired behavior, aligning the generated text with the preferences of the human operator. We propose using another language model as a critic that acts as a zero-shot reward model: it is prompted with a Yes-No question that represents the user preferences, so no further labeled data is required. This zero-shot reward model provides the learning signal to further fine-tune the base LLM using reinforcement learning, as in RLAIF; our approach is also compatible with other contexts, such as quality-diversity search. We provide extensive evidence of the capabilities of the proposed ZYN framework through experiments in different text-generation domains, including detoxification; optimizing the sentiment of movie reviews, or any other attribute; steering the opinion the model may have about a particular topic; and personalizing prompt generators for text-to-image tasks. Code to be released at <https://github.com/vicgalle/zero-shot-reward-models/>.
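
The idea of turning a Yes-No question into a reward can be illustrated with a minimal sketch. The abstract does not give the exact reward formula, so the code below assumes the reward is the critic's probability of answering "Yes", renormalized over the two answers "Yes" and "No". The prompt template, the function names (`build_prompt`, `yes_no_reward`, `zero_shot_reward`), and the `toy_critic` stand-in are all hypothetical and are not taken from the authors' repository; in practice the critic would be a real language model returning next-token logits.

```python
import math
from typing import Callable, Dict

# A critic maps a prompt to next-token logits for the answer tokens.
# This signature is an assumption made for illustration only.
Critic = Callable[[str], Dict[str, float]]


def build_prompt(text: str, question: str) -> str:
    """Wrap the generated text and the Yes-No question (hypothetical template)."""
    return f"Text: {text}\nQuestion: {question}\nAnswer:"


def yes_no_reward(logits: Dict[str, float]) -> float:
    """Two-way softmax over the 'Yes' and 'No' logits; returns P(Yes)."""
    yes, no = logits["Yes"], logits["No"]
    m = max(yes, no)  # subtract the max for numerical stability
    e_yes, e_no = math.exp(yes - m), math.exp(no - m)
    return e_yes / (e_yes + e_no)


def zero_shot_reward(text: str, question: str, critic: Critic) -> float:
    """Score a generation with no labeled data: ask the critic the question."""
    return yes_no_reward(critic(build_prompt(text, question)))


def toy_critic(prompt: str) -> Dict[str, float]:
    """Stand-in for a real language model, used only to keep the sketch runnable."""
    rude = "idiot" in prompt.lower()
    return {"Yes": -2.0, "No": 2.0} if rude else {"Yes": 2.0, "No": -2.0}


if __name__ == "__main__":
    question = "Is this text polite?"
    for sample in ["Thank you for your help.", "You are an idiot."]:
        print(sample, round(zero_shot_reward(sample, question, toy_critic), 3))
```

Under these assumptions, the scalar returned by `zero_shot_reward` would play the role of the reward passed to a reinforcement learning fine-tuning loop; changing the question changes the behavior being rewarded without collecting any new labels.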


research
07/22/2022

Zero-Shot Video Captioning with Evolving Pseudo-Tokens

We introduce a zero-shot video captioning method that employs two frozen...
research
03/15/2023

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

Large scale Vision-Language (VL) models have shown tremendous success in...
research
10/28/2022

OhMG: Zero-shot Open-vocabulary Human Motion Generation

Generating motion in line with text has attracted increasing attention n...
research
11/10/2022

Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control

Pretrained language models have demonstrated extraordinary capabilities ...
research
12/21/2022

Critic-Guided Decoding for Controlled Text Generation

Steering language generation towards objectives or away from undesired c...
research
06/29/2023

ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles

Automatically generating textual content with desired attributes is an a...
research
10/25/2021

Zero-Shot Dialogue Disentanglement by Self-Supervised Entangled Response Selection

Dialogue disentanglement aims to group utterances in a long and multi-pa...
