Constrained episodic reinforcement learning in concave-convex and knapsack settings

by Kianté Brantley, et al.

We propose an algorithm for tabular episodic reinforcement learning with constraints. We provide a modular analysis with strong theoretical guarantees for settings with concave rewards and convex constraints, and for settings with hard constraints (knapsacks). Most of the previous work in constrained reinforcement learning is limited to linear constraints, and the remaining work focuses on either the feasibility question or settings with a single episode. Our experiments demonstrate that the proposed algorithm significantly outperforms these approaches in existing constrained episodic environments.

