Learning Control Policies for Stochastic Systems with Reach-avoid Guarantees

10/11/2022
by Đorđe Žikelić, et al.

We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold p∈[0,1] over the infinite time horizon. Our method leverages advances in the machine learning literature and represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems: it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy that does not satisfy the reach-avoid specification. We validate our approach on three stochastic non-linear reinforcement learning tasks.
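To make the idea of a supermartingale-style certificate concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration and is not taken from the abstract: the scalar system x' = 0.5x + w with uniform noise, the sets, the candidate function V(x) = 4x², and the four conditions checked (V nonnegative, V ≤ 1 on initial states, V ≥ 1/(1−p) on unsafe states, and an expected decrease of at least ε outside the target wherever V ≤ 1/(1−p)). The check runs on a finite grid only, so it is a sanity test and not a formal verification.

```python
# Toy grid check of reach-avoid-certificate-style conditions.
# All names, sets and constants below are hypothetical illustrations.

TOL = 1e-9  # slack for floating-point comparisons


def V(x):
    """Candidate certificate (assumed): a scaled quadratic."""
    return 4.0 * x * x


def step(x, w):
    """Assumed closed-loop dynamics: contraction plus additive noise w."""
    return 0.5 * x + w


def expected_next_V(x, n=200, w_max=0.1):
    """Approximate E[V(step(x, w))] for w ~ Uniform[-w_max, w_max]
    using the midpoint rule on n equal cells."""
    h = 2.0 * w_max / n
    return sum(V(step(x, -w_max + (i + 0.5) * h)) for i in range(n)) / n


def grid(lo, hi, n):
    """n evenly spaced points from lo to hi, inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def check_rasm(p=0.8, eps=0.1):
    """Check the four assumed conditions on a grid over [-2, 2].

    Assumed sets: initial |x| <= 0.5, target |x| <= 0.2, unsafe |x| >= 1.5.
    Returns one boolean per condition.
    """
    threshold = 1.0 / (1.0 - p)
    states = grid(-2.0, 2.0, 401)
    return {
        "nonnegative": all(V(x) >= 0.0 for x in states),
        "initial": all(V(x) <= 1.0 + TOL for x in states if abs(x) <= 0.5),
        "unsafe": all(V(x) >= threshold - TOL for x in states if abs(x) >= 1.5),
        "decrease": all(
            V(x) - expected_next_V(x) >= eps
            for x in states
            if abs(x) > 0.2 and V(x) <= threshold
        ),
    }


if __name__ == "__main__":
    print(check_rasm())
```

Calling `check_rasm()` returns a dictionary with one boolean per condition. Raising `p` raises the level 1/(1−p) that V must reach on the unsafe set, so the same V can pass for a modest threshold and fail for a stricter one.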


