Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

12/30/2021
by Tong Mu, et al.
Online reinforcement learning (RL) algorithms are often difficult to deploy in complex human-facing applications because they may learn slowly and perform poorly early on. To address this, we introduce a practical algorithm for incorporating human insight to speed up learning. Our algorithm, Constraint Sampling Reinforcement Learning (CSRL), incorporates prior domain knowledge as constraints (restrictions) on the RL policy. It takes in multiple candidate policy constraints, which keeps it robust to misspecification of any individual constraint while letting it leverage the helpful ones to learn quickly. Given a base RL algorithm (e.g., UCRL, DQN, Rainbow), we propose an upper-confidence-with-elimination scheme that leverages the relationship between the constraints and their observed performance to adaptively switch among them. We instantiate our algorithm with DQN-type algorithms and UCRL as base algorithms, and evaluate it in four environments, including three simulators based on real data: recommendations, educational activity sequencing, and HIV treatment sequencing. In all cases, CSRL learns a good policy faster than the baselines.
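The abstract does not spell out the selection rule, so the following is only a minimal sketch of what a generic upper-confidence-with-elimination loop over candidate constraints might look like. It is not the authors' CSRL procedure: it ignores the relationship between constraints that CSRL exploits, and the class name `ConstraintSelector`, the parameters `reward_range` and `delta`, and the Hoeffding-style confidence radius are all assumptions made for illustration. Episode returns are assumed to lie in `[0, reward_range]`.

```python
import math


class ConstraintSelector:
    """Hypothetical UCB-with-elimination selector over candidate constraints.

    Illustrative only; not the CSRL algorithm from the paper.
    """

    def __init__(self, n_constraints, reward_range=1.0, delta=0.05):
        self.active = set(range(n_constraints))  # constraints still in play
        self.counts = [0] * n_constraints        # episodes run under each constraint
        self.means = [0.0] * n_constraints       # running mean of episode returns
        self.reward_range = reward_range         # assumed bound on episode returns
        self.delta = delta                       # assumed confidence parameter

    def _radius(self, i):
        # Hoeffding-style confidence radius; infinite until the first sample.
        if self.counts[i] == 0:
            return math.inf
        return self.reward_range * math.sqrt(
            math.log(1.0 / self.delta) / (2.0 * self.counts[i])
        )

    def select(self):
        # Pick the active constraint with the highest upper confidence bound;
        # ties go to the lowest index.
        return max(sorted(self.active),
                   key=lambda i: self.means[i] + self._radius(i))

    def update(self, i, episode_return):
        # Incorporate the return observed while acting under constraint i.
        self.counts[i] += 1
        self.means[i] += (episode_return - self.means[i]) / self.counts[i]
        # Eliminate any constraint whose upper bound falls below the best
        # lower bound among the active constraints.
        best_lower = max(self.means[j] - self._radius(j) for j in self.active)
        self.active = {j for j in self.active
                       if self.means[j] + self._radius(j) >= best_lower}
```

In use, each episode would call `select()` to choose a constraint, run the base RL algorithm (for example a DQN-type learner) with its policy restricted by that constraint, and then pass the episode return to `update()`. A constraint that consistently underperforms is dropped from `active`, which is one way a misspecified constraint could be prevented from slowing learning indefinitely.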

research
10/25/2022

In-context Reinforcement Learning with Algorithm Distillation

We propose Algorithm Distillation (AD), a method for distilling reinforc...
research
02/16/2021

Transferring Domain Knowledge with an Adviser in Continuous Tasks

Recent advances in Reinforcement Learning (RL) have surpassed human-leve...
research
04/30/2023

Joint Learning of Policy with Unknown Temporal Constraints for Safe Reinforcement Learning

In many real-world applications, safety constraints for reinforcement le...
research
02/14/2020

Robust Reinforcement Learning via Adversarial training with Langevin Dynamics

We introduce a sampling perspective to tackle the challenging task of tr...
research
11/24/2019

ORL: Reinforcement Learning Benchmarks for Online Stochastic Optimization Problems

Reinforcement Learning (RL) has achieved state-of-the-art results in dom...
research
01/19/2021

Spatial Assembly: Generative Architecture With Reinforcement Learning, Self Play and Tree Search

With this work, we investigate the use of Reinforcement Learning (RL) fo...
research
04/02/2020

Value Driven Representation for Human-in-the-Loop Reinforcement Learning

Interactive adaptive systems powered by Reinforcement Learning (RL) have...
