RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment

07/24/2023
by Kevin Yang, et al.

We propose Reinforcement Learning from Contrast Distillation (RLCD), a method for aligning language models to follow natural language principles without using human feedback. RLCD trains a preference model using simulated preference pairs that contain both a high-quality and low-quality example, generated using contrasting positive and negative prompts. The preference model is then used to improve a base unaligned language model via reinforcement learning. Empirically, RLCD outperforms RLAIF (Bai et al., 2022b) and context distillation (Huang et al., 2022) baselines across three diverse alignment tasks (harmlessness, helpfulness, and story outline generation) and on both 7B and 30B model scales for preference data simulation.
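
To make the data-simulation step concrete, the sketch below shows, in Python, one way contrasting positive and negative prompt variants could yield simulated preference pairs. The hint wording, the simulate_pairs/PreferencePair names, and the stand-in generator are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch of RLCD-style preference-pair simulation, assuming a generic
# text-generation callable. Prompt wording, names, and the dummy generator are
# illustrative assumptions, not the authors' released implementation.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreferencePair:
    prompt: str     # unmodified user prompt / conversation prefix
    preferred: str  # continuation sampled under the positive prompt variant
    rejected: str   # continuation sampled under the negative prompt variant


def simulate_pairs(
    prompts: List[str],
    generate: Callable[[str], str],
    positive_hint: str = "(Give a helpful, honest, and harmless response.)",
    negative_hint: str = "(Give an unhelpful or harmful response.)",
) -> List[PreferencePair]:
    """Build simulated preference pairs from contrasting prompt variants.

    Each prompt is answered twice: once with a positive hint appended and once
    with a negative hint. The positive-prompted output is labeled as preferred
    directly, with no separate scoring step; the resulting pairs are what the
    preference model would be trained on before the RL stage.
    """
    pairs: List[PreferencePair] = []
    for p in prompts:
        preferred = generate(f"{p}\n{positive_hint}")
        rejected = generate(f"{p}\n{negative_hint}")
        pairs.append(PreferencePair(prompt=p, preferred=preferred, rejected=rejected))
    return pairs


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without a real LLM; in practice this
    # would be a sampling call to an unaligned base model (e.g., 7B or 30B scale).
    def dummy_generate(prompt: str) -> str:
        return "safe, useful reply" if "harmless" in prompt else "low-quality reply"

    for pair in simulate_pairs(["How do I pick a strong password?"], dummy_generate):
        print(pair)
```

In the full pipeline described in the abstract, a preference model is trained on pairs like these and then used as the reward signal for reinforcement learning on the base model.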

Related research

01/11/2021
Implicit Unlikelihood Training: Improving Neural Text Generation with Reinforcement Learning
Likelihood training and maximization-based decoding result in dull and r...

12/20/2022
On Improving Summarization Factual Consistency from Natural Language Feedback
Despite the recent progress in language generation models, their outputs...

12/16/2021
Goal-Directed Story Generation: Augmenting Generative Language Models with Reinforcement Learning
The advent of large pre-trained generative language models has provided ...

03/22/2023
Can we trust the evaluation on ChatGPT?
ChatGPT, the first large language model (LLM) with mass adoption, has de...

09/13/2023
Statistical Rejection Sampling Improves Preference Optimization
Improving the alignment of language models with human preferences remain...

06/30/2023
Preference Ranking Optimization for Human Alignment
Large language models (LLMs) often contain misleading content, emphasizi...

09/28/2022
Improving alignment of dialogue agents via targeted human judgements
We present Sparrow, an information-seeking dialogue agent trained to be ...
