Automatic Exploration Process Adjustment for Safe Reinforcement Learning with Joint Chance Constraint Satisfaction

03/05/2021
by Yoshihiro Okawa, et al.

In reinforcement learning (RL) algorithms, exploratory control inputs are applied during learning to acquire knowledge for decision making and control while the true dynamics of the controlled object are unknown. However, this exploratory behavior can violate constraints on the state of the controlled object and thereby lead to undesired situations. In this paper, we propose an automatic exploration process adjustment method for safe RL in continuous state and action spaces that utilizes a linear nominal model of the controlled object. Specifically, at each time step, our proposed method automatically decides whether or not to use the exploratory input, depending on the current state and its predicted value, and adjusts the variance-covariance matrix of the Gaussian policy used for exploration. We also show that our exploration process adjustment method theoretically guarantees that the constraints are satisfied with a pre-specified probability, that is, that a joint chance constraint is satisfied at every time step. Finally, we illustrate the validity and effectiveness of our method through numerical simulation.
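
The idea of switching exploration on or off and shrinking the exploration covariance from a one-step prediction can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a disturbance-free linear nominal model x_next = A x + B u, linear state constraints H x <= d, isotropic Gaussian noise, and a simple union-bound argument over the constraints. The function name `exploration_scale` and the parameters `delta` and `sigma_max` are hypothetical and chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist


def exploration_scale(A, B, H, d, x, u_mean, delta=0.05, sigma_max=1.0):
    """Return a std-dev for isotropic Gaussian exploration noise.

    Illustrative assumptions (not taken from the paper):
      - nominal model x_next = A x + B (u_mean + eps), eps ~ N(0, sigma^2 I)
      - constraints H x_next <= d must hold jointly with prob >= 1 - delta
      - no model error or disturbance is accounted for
    A return value of 0.0 means "do not explore at this step".
    """
    # One-step prediction of the state under the non-exploratory input.
    x_pred = A @ x + B @ u_mean
    margin = d - H @ x_pred
    if np.any(margin <= 0.0):
        # The predicted state already touches or violates a constraint.
        return 0.0

    m = H.shape[0]
    # Union bound: give each of the m constraints a violation budget delta/m.
    z = NormalDist().inv_cdf(1.0 - delta / m)
    # Std-dev of h_j^T B eps per unit sigma, for each constraint row h_j.
    gain = np.linalg.norm(H @ B, axis=1)
    active = gain > 0.0
    if not np.any(active):
        # The noise cannot move the state along any constraint direction.
        return float(sigma_max)

    sigma = np.min(margin[active] / (z * gain[active]))
    return float(min(sigma, sigma_max))


if __name__ == "__main__":
    A = np.eye(2)
    B = np.eye(2)
    H = np.array([[1.0, 0.0]])  # single constraint: x[0] <= 1
    d = np.array([1.0])
    u = np.zeros(2)
    print(exploration_scale(A, B, H, d, np.array([0.0, 0.0]), u))
    print(exploration_scale(A, B, H, d, np.array([2.0, 0.0]), u))
```

In this sketch the returned scale shrinks as the predicted state approaches a constraint boundary and drops to zero once the prediction leaves the feasible region, which mirrors the two adjustments described in the abstract only in spirit.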

research
09/30/2022

Safe Exploration Method for Reinforcement Learning under Existence of Disturbance

Recent rapid developments in reinforcement learning algorithms have been...
research
07/17/2022

Robust Action Governor for Uncertain Piecewise Affine Systems with Non-convex Constraints and Safe Reinforcement Learning

The action governor is an add-on scheme to a nominal control loop that m...
research
04/23/2021

Safe Chance Constrained Reinforcement Learning for Batch Process Control

Reinforcement Learning (RL) controllers have generated excitement within...
research
04/14/2021

Safe Continuous Control with Constrained Model-Based Policy Optimization

The applicability of reinforcement learning (RL) algorithms in real-worl...
research
07/08/2022

Safe reinforcement learning for multi-energy management systems with known constraint functions

Reinforcement learning (RL) is a promising optimal control technique for...
research
07/26/2019

Large scale continuous-time mean-variance portfolio allocation via reinforcement learning

We propose to solve large scale Markowitz mean-variance (MV) portfolio a...
research
09/17/2022

Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems

Recently, self-learning methods based on user satisfaction metrics and c...
