Verifiable Reinforcement Learning Systems via Compositionality

09/09/2023
by Cyrus Neary, et al.

We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL subsystems, each of which learns to accomplish a separate subtask, are composed to achieve an overall task. The framework consists of a high-level model, represented as a parametric Markov decision process, which is used to plan and analyze compositions of subsystems, and of the collection of low-level subsystems themselves. The subsystems are implemented as deep RL agents operating under partial observability. By defining interfaces between the subsystems, the framework enables automatic decomposition of task specifications, e.g., reach a target set of states with a probability of at least 0.95, into individual subtask specifications, i.e., achieve the subsystem's exit conditions with at least some minimum probability, given that its entry conditions are met. This in turn allows the subsystems to be trained and tested independently. We present theoretical results guaranteeing that if each subsystem learns a policy satisfying its subtask specification, then their composition satisfies the overall task specification. Conversely, if the subtask specifications cannot all be satisfied by the learned policies, we present a method, formulated as the problem of finding an optimal set of parameters in the high-level model, to automatically update the subtask specifications to account for the observed shortcomings. The result is an iterative procedure for defining subtask specifications and for training the subsystems to meet them. Experimental results demonstrate the framework's capabilities in environments with both full and partial observability, discrete and continuous state and action spaces, and deterministic and stochastic dynamics.
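The compositional guarantee described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a simplified high-level model in which each subsystem either reaches its exit condition with probability at least `p[c]` or fails permanently, and every name here (`hlm`, `p`, `composed_success_probability`, the states and subsystems) is hypothetical. Under those assumptions, the best achievable probability of reaching the goal is obtained by value iteration over the high-level states and then compared against the task threshold of 0.95.

```python
def composed_success_probability(hlm, p, start, goal, iters=100):
    """Maximum probability of reaching `goal` from `start`.

    hlm maps each high-level state to a list of (subsystem, next_state)
    choices. Subsystem c succeeds with probability p[c]; failure is
    assumed to be absorbing (an assumption of this sketch).
    """
    value = {s: 0.0 for s in hlm}
    value[goal] = 1.0
    for _ in range(iters):
        for s, choices in hlm.items():
            if s == goal:
                continue
            # Pick the subsystem that maximizes the chance of eventual success.
            value[s] = max((p[c] * value[nxt] for c, nxt in choices), default=0.0)
    return value[start]


# Hypothetical high-level model: two alternative routes to the goal.
hlm = {
    "s0": [("c0", "s1"), ("c1", "s2")],
    "s1": [("c2", "goal")],
    "s2": [("c3", "goal")],
    "goal": [],
}
# Hypothetical subtask specifications (minimum success probabilities).
p = {"c0": 0.98, "c1": 0.99, "c2": 0.97, "c3": 0.90}

result = composed_success_probability(hlm, p, "s0", "goal")
task_satisfied = result >= 0.95  # overall task: reach the goal w.p. >= 0.95
```

In this toy instance the route through `c0` and `c2` gives 0.98 × 0.97 ≈ 0.9506, so the overall specification holds as long as those two subsystems meet their subtask specifications. If a trained subsystem fell short of its value in `p`, the parameters would need to be re-chosen, which is the role the abstract assigns to the parameter-optimization step.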


