Assured RL: Reinforcement Learning with Almost Sure Constraints

by Agustin Castellano, et al.

We consider the problem of finding optimal policies for a Markov Decision Process with almost sure constraints on state transitions and action triplets. We define value and action-value functions that satisfy a barrier-based decomposition, which allows feasible policies to be identified independently of the reward process. We prove that, given a policy π, certifying whether certain state-action pairs lead to feasible trajectories under π is equivalent to solving an auxiliary problem aimed at finding the probability of performing an infeasible transition. Using this interpretation, we develop a Barrier-learning algorithm, based on Q-Learning, that identifies such unsafe state-action pairs. Our analysis motivates the need to enhance the Reinforcement Learning (RL) framework with an additional signal besides rewards, here called a damage function, that provides feasibility information and enables the solution of RL problems with model-free constraints. Moreover, our Barrier-learning algorithm wraps around existing RL algorithms, such as Q-Learning and SARSA, giving them the ability to solve almost-surely constrained problems.
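The idea of a feasibility signal that is learned separately from rewards can be illustrated with a small tabular sketch. This is not the authors' Barrier-learning algorithm, whose exact update rule is not given in the abstract. It is a minimal illustration under stated assumptions: the damage signal is binary, states and actions are finite, and a pair (s, a) is flagged unsafe either when damage is observed or when an observed transition leads to a state in which every action is already flagged. All names (`barrier_update`, `q_update`, `greedy_safe_action`, the toy transitions) are hypothetical.

```python
# Assumed toy setup: 3 states, 2 actions, binary damage signal.
N_STATES, N_ACTIONS = 3, 2

def barrier_update(unsafe, s, a, damage, s_next):
    """Flag (s, a) as unsafe if damage was observed, or if the observed
    next state has no action left that is still believed safe."""
    if damage or all(unsafe[s_next]):
        unsafe[s][a] = True

def q_update(Q, unsafe, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Standard Q-Learning step, with the max restricted to actions
    not flagged unsafe in the next state (0.0 if none remain)."""
    safe_values = [q for q, bad in zip(Q[s_next], unsafe[s_next]) if not bad]
    best_next = max(safe_values) if safe_values else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

def greedy_safe_action(Q, unsafe, s):
    """Best action by Q-value among those not flagged unsafe, or None."""
    safe = [a for a in range(len(Q[s])) if not unsafe[s][a]]
    return max(safe, key=lambda a: Q[s][a]) if safe else None

# Hypothetical experience: (state, action, reward, damage, next_state).
transitions = [
    (1, 0, 0.0, True, 1),   # both actions in state 1 cause damage
    (1, 1, 0.0, True, 1),
    (0, 0, 5.0, False, 1),  # high reward, but leads into state 1
    (0, 1, 0.0, False, 2),  # no reward, leads to state 2
    (2, 0, 1.0, False, 2),  # state 2 has a damage-free action
]

unsafe = [[False] * N_ACTIONS for _ in range(N_STATES)]
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
for s, a, r, damage, s_next in transitions:
    barrier_update(unsafe, s, a, damage, s_next)
    q_update(Q, unsafe, s, a, r, s_next)
```

In this toy run, action 0 in state 0 collects the larger immediate reward but is flagged unsafe because it leads to a state with no safe action, so `greedy_safe_action(Q, unsafe, 0)` selects action 1. The flags depend only on the damage signal and the observed transitions, never on the rewards, which mirrors the reward-independent identification of feasible policies described above.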




Learning to Act Safely with Limited Exposure and Almost Sure Certainty

This paper aims to put forward the concept that learning to take safe ac...

State Action Separable Reinforcement Learning

Reinforcement Learning (RL) based methods have seen their paramount succ...

Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies

Standard reinforcement learning (RL) aims to find an optimal policy that...

Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Model-based reinforcement learning (RL) is more sample efficient than mo...

Reinforcement Learning with Almost Sure Constraints

In this work we address the problem of finding feasible policies for Con...

Recall Traces: Backtracking Models for Efficient Reinforcement Learning

In many environments only a tiny subset of all states yield high reward....

Spatial Assembly: Generative Architecture With Reinforcement Learning, Self Play and Tree Search

With this work, we investigate the use of Reinforcement Learning (RL) fo...