Formal Language Constraints for Markov Decision Processes

10/02/2019
by Eleanor Quint, et al.

In order to satisfy safety conditions, a reinforcement learning (RL) agent may be constrained from acting freely, e.g., to prevent trajectories that might cause unwanted behavior or physical damage in a robot. We propose a general framework for augmenting a Markov decision process (MDP) with constraints that are described in formal languages over sequences of MDP states and agent actions. Constraints are enforced either by filtering the allowed action set (hard enforcement) or by applying potential-based reward shaping (soft enforcement). We instantiate this framework using deterministic finite automata to encode constraints and propose methods of augmenting MDP observations with the state of the constraint automaton for learning. We empirically evaluate these methods with a variety of constraints by training Deep Q-Networks in Atari games as well as Proximal Policy Optimization in MuJoCo environments. We find experimentally that our approaches significantly reduce or eliminate constraint violations, with either a minimal negative impact or, depending on the constraint, a clear positive impact on final performance.
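
To make the three mechanisms in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the class and function names, the toy constraint ("never take `left` twice in a row"), the choice of actions as the only DFA tokens, and the potential values. It shows hard enforcement by action filtering, soft enforcement with the standard potential-based shaping term gamma * Phi(q') - Phi(q), and observation augmentation with a one-hot encoding of the automaton state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstraintDFA:
    """Hypothetical constraint automaton over action tokens (illustrative only)."""
    states: tuple          # ordered DFA states, used for one-hot encoding
    start: str             # initial DFA state
    violating: frozenset   # states that mean the constraint has been violated
    transitions: dict      # (state, token) -> next state; must be total

    def step(self, state, token):
        return self.transitions[(state, token)]


def allowed_actions(dfa, q, actions):
    """Hard enforcement: keep only actions that do not enter a violating state."""
    return [a for a in actions if dfa.step(q, a) not in dfa.violating]


def shaped_reward(reward, q, q_next, potential, gamma=0.99):
    """Soft enforcement: add the potential-based term gamma * Phi(q') - Phi(q)."""
    return reward + gamma * potential[q_next] - potential[q]


def augment_observation(obs, dfa, q):
    """Append a one-hot encoding of the DFA state to a list-valued observation."""
    one_hot = [1.0 if s == q else 0.0 for s in dfa.states]
    return list(obs) + one_hot


# Toy constraint (assumed): the agent must never take "left" twice in a row.
dfa = ConstraintDFA(
    states=("q0", "q1", "bad"),
    start="q0",
    violating=frozenset({"bad"}),
    transitions={
        ("q0", "left"): "q1", ("q0", "right"): "q0",
        ("q1", "left"): "bad", ("q1", "right"): "q0",
        ("bad", "left"): "bad", ("bad", "right"): "bad",
    },
)

# Assumed potentials: lower values the closer the automaton is to a violation.
potential = {"q0": 0.0, "q1": -0.5, "bad": -1.0}

q = dfa.step(dfa.start, "left")                       # automaton is now in "q1"
legal = allowed_actions(dfa, q, ["left", "right"])    # "left" is filtered out
soft = shaped_reward(1.0, q, "bad", potential, gamma=1.0)
obs = augment_observation([0.2, 0.7], dfa, q)
```

In this sketch the filter would be applied before action selection (for example, by masking Q-values of disallowed actions), while the shaped reward would replace the environment reward during training. How the potentials are chosen and which tokens the automaton reads are design decisions that the abstract does not specify.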

research
01/23/2019

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Pro...
research
09/20/2019

Reconnaissance and Planning algorithm for constrained MDP

Practical reinforcement learning problems are often formulated as constr...
research
08/26/2020

Constrained Markov Decision Processes via Backward Value Functions

Although Reinforcement Learning (RL) algorithms have found tremendous su...
research
06/15/2016

Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

In classical reinforcement learning, when exploring an environment, agen...
research
06/15/2018

Learning 6-DoF Grasping and Pick-Place Using Attention Focus

We address a class of manipulation problems where the robot perceives th...
research
02/21/2022

Learning Behavioral Soft Constraints from Demonstrations

Many real-life scenarios require humans to make difficult trade-offs: do...
research
02/15/2021

How RL Agents Behave When Their Actions Are Modified

Reinforcement learning in complex environments may require supervision t...