SLiC-HF: Sequence Likelihood Calibration with Human Feedback

05/17/2023
by Yao Zhao, et al.

Learning from human feedback has been shown to be effective at aligning language models with human preferences. Past work has often relied on Reinforcement Learning from Human Feedback (RLHF), which optimizes the language model using reward scores assigned by a reward model trained on human preference data. In this work we show how the recently introduced Sequence Likelihood Calibration (SLiC) can also be used to learn effectively from human preferences (SLiC-HF). Furthermore, we demonstrate that this can be done with human feedback data collected for a different model, analogous to off-policy, offline RL data. Automatic and human evaluation experiments on the TL;DR summarization task show that SLiC-HF significantly improves over supervised fine-tuning baselines. Moreover, SLiC-HF presents a competitive alternative to the PPO RLHF implementation used in past work while being much simpler to implement, easier to tune, and more computationally efficient in practice.
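The abstract does not spell out the training objective, so the following is a minimal sketch of how a sequence-likelihood calibration loss on preference pairs is commonly formulated: a margin (hinge) term that pushes the log-likelihood of the human-preferred sequence above that of the rejected one, plus a regularization term toward the supervised fine-tuning targets. The function name, the margin `delta`, and the weight `lam` are illustrative placeholders, not the authors' exact formulation.

```python
import torch

def slic_hf_loss(logp_pos, logp_neg, logp_ref, delta=1.0, lam=0.5):
    """Sketch of a rank-calibration loss on sequence log-likelihoods.

    logp_pos: log P_theta(y_preferred | x), summed over tokens, shape (batch,)
    logp_neg: log P_theta(y_rejected  | x), summed over tokens, shape (batch,)
    logp_ref: log P_theta(y_sft_target | x), used to regularize toward the
              supervised fine-tuned behaviour, shape (batch,)
    delta:    margin by which the preferred sequence should out-score the
              rejected one in log-likelihood (assumed hyperparameter)
    lam:      weight of the regularization term (assumed hyperparameter)
    """
    # Calibration term: hinge loss on the log-likelihood gap between the
    # preferred and rejected sequences.
    calibration = torch.clamp(delta - logp_pos + logp_neg, min=0.0)
    # Regularization term: keep assigning high likelihood to SFT targets.
    regularization = -logp_ref
    return (calibration + lam * regularization).mean()
```

Unlike PPO-based RLHF, this objective needs no reward model at training time and no on-policy sampling loop, which is consistent with the simplicity and efficiency claims above.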
