Constrained Upper Confidence Reinforcement Learning

01/26/2020
by   Liyuan Zheng, et al.

Constrained Markov Decision Processes are a class of stochastic decision problems in which the decision maker must select a policy that satisfies auxiliary cost constraints. This paper extends upper confidence reinforcement learning to settings in which the reward function and the constraints, described by cost functions, are unknown a priori but the transition kernel is known. Such a setting is well motivated by a number of applications, including exploration of unknown, potentially unsafe, environments. We present an algorithm, C-UCRL, and show that it achieves sublinear regret O(T^{3/4} √(log(T/δ))) with respect to the reward while satisfying the constraints, even during learning, with probability 1-δ. Illustrative examples are provided.
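The abstract does not spell out the algorithm itself, but the setting it describes (known transition kernel, unknown reward and cost functions) suggests UCB-style confidence intervals around empirical reward and cost estimates. The sketch below is a minimal, illustrative outline under that assumption; the function names, the Hoeffding-style bonus, and the optimistic-reward / pessimistic-cost split are illustrative choices, not details taken from the paper.

```python
import numpy as np

# Illustrative sketch (not the paper's exact algorithm): UCB-style estimates
# for an unknown reward and an unknown constraint cost in a tabular CMDP
# with a KNOWN transition kernel, matching the setting in the abstract.

def confidence_bonus(counts, t, delta):
    """Hoeffding-style bonus sqrt(log(t/delta) / n) for each (s, a) pair."""
    n = np.maximum(counts, 1)
    return np.sqrt(np.log(max(t, 2) / delta) / n)

def optimistic_reward(reward_sums, counts, t, delta):
    """Empirical mean reward plus exploration bonus (optimism for the reward)."""
    mean = reward_sums / np.maximum(counts, 1)
    return np.clip(mean + confidence_bonus(counts, t, delta), 0.0, 1.0)

def pessimistic_cost(cost_sums, counts, t, delta):
    """Empirical mean cost plus bonus (conservatism for the constraint)."""
    mean = cost_sums / np.maximum(counts, 1)
    return np.clip(mean + confidence_bonus(counts, t, delta), 0.0, 1.0)

# With these estimates, each episode one would solve a constrained planning
# problem (e.g., a linear program over occupancy measures using the known
# transition kernel), maximizing the optimistic reward subject to the
# pessimistic cost staying below the budget, so that the constraint holds
# with high probability even while learning.

if __name__ == "__main__":
    S, A, t, delta = 4, 2, 100, 0.05
    counts = np.random.randint(1, 20, size=(S, A))
    r_sums = np.random.rand(S, A) * counts
    c_sums = np.random.rand(S, A) * counts
    print(optimistic_reward(r_sums, counts, t, delta))
    print(pessimistic_cost(c_sums, counts, t, delta))
```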

