Reinforcement Learning from Diverse Human Preferences

01/27/2023
by Wanqi Xue et al.

The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL). Describing an agent's desired behaviors and properties can be difficult, even for experts. A new paradigm, reinforcement learning from human preferences (preference-based RL), has emerged as a promising solution, in which reward functions are learned from human preference labels over behavior trajectories. However, existing preference-based RL methods rely on accurate oracle preference labels. This paper addresses that limitation by developing a method for crowd-sourcing preference labels and learning from diverse human preferences. The key idea is to stabilize reward learning through regularization and correction in a latent space. To ensure temporal consistency, a strong constraint is imposed on the reward model, forcing its latent space to stay close to the prior distribution. In addition, a confidence-based reward-model ensembling method is designed to generate more stable and reliable predictions. The proposed method is evaluated on a variety of tasks in DMControl and Meta-world and shows consistent and significant improvements over existing preference-based RL algorithms when learning from diverse feedback, paving the way for real-world applications of RL methods.
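The abstract names three ingredients: a reward model trained from preference labels, a regularizer that keeps the model's latent space close to a prior, and confidence-based ensembling. The minimal NumPy sketch below illustrates each piece under common assumptions from the preference-based RL literature (a Bradley-Terry preference loss, a Gaussian KL-to-prior penalty, and agreement-based ensemble weighting); the function names and the exact confidence weighting are illustrative guesses, not the paper's implementation.

```python
import numpy as np

def bradley_terry_loss(r_hat_a, r_hat_b, label):
    """Preference loss: the probability that segment A is preferred
    over segment B follows a Bradley-Terry model on summed predicted
    rewards. label = 1 if A is preferred, 0 if B is preferred."""
    logit = np.sum(r_hat_a) - np.sum(r_hat_b)
    p_a = 1.0 / (1.0 + np.exp(-logit))
    return -(label * np.log(p_a + 1e-8) + (1 - label) * np.log(1 - p_a + 1e-8))

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions --
    the kind of penalty that constrains the reward model's latent
    space to stay close to a standard-normal prior."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def confidence_weighted_ensemble(predictions, temperature=1.0):
    """Combine per-model reward predictions, down-weighting models
    that disagree with the ensemble mean (a hypothetical stand-in
    for the paper's confidence-based ensembling)."""
    preds = np.asarray(predictions, dtype=float)  # shape (n_models, T)
    mean = preds.mean(axis=0)
    dist = np.mean((preds - mean) ** 2, axis=1)   # per-model disagreement
    w = np.exp(-dist / temperature)
    w /= w.sum()
    return (w[:, None] * preds).sum(axis=0)
```

With two equally-rewarded segments the Bradley-Terry loss reduces to -log(0.5), and models that track the ensemble mean receive the largest weights; a full training loop would add the KL term to the preference loss before each gradient step.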

