Reinforcement Learning Under Probabilistic Spatio-Temporal Constraints with Time Windows

07/29/2023
by Xiaoshan Lin, et al.

We propose an automata-theoretic approach for reinforcement learning (RL) under complex spatio-temporal constraints with time windows. The problem is formulated using a Markov decision process under a bounded temporal logic constraint. Unlike existing RL methods that can eventually learn optimal policies satisfying such constraints, our proposed approach enforces a desired probability of constraint satisfaction throughout learning. This is achieved by translating the bounded temporal logic constraint into a total automaton and avoiding "unsafe" actions based on the available prior information about the transition probabilities, namely a pair of upper and lower bounds for each transition probability. We provide theoretical guarantees on the resulting probability of constraint satisfaction. We also provide numerical results for a scenario in which a robot explores the environment to discover high-reward regions while fulfilling periodic pick-up and delivery tasks that are encoded as temporal logic constraints.
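To make the idea of avoiding "unsafe" actions from interval bounds concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It assumes the constraint has already been reduced to reaching a set of accepting states within a fixed number of steps on a small product model, and it uses a standard worst-case (interval) value iteration to obtain a guaranteed lower bound on the satisfaction probability. The model format `bounds[state][action][next_state] = (lower, upper)`, the function names, and the toy numbers are all hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable
Action = Hashable
Interval = Tuple[float, float]
# Hypothetical model format: bounds[state][action][next_state] = (lower, upper)
Bounds = Dict[State, Dict[Action, Dict[State, Interval]]]


def worst_case_expectation(succ: Dict[State, Interval],
                           value: Dict[State, float]) -> float:
    """Minimise sum p(s') * value(s') over distributions inside the intervals."""
    probs = {s: lo for s, (lo, _) in succ.items()}
    slack = 1.0 - sum(probs.values())
    # Push the remaining probability mass onto the lowest-value successors first.
    for s in sorted(succ, key=lambda s: value[s]):
        extra = min(succ[s][1] - succ[s][0], slack)
        probs[s] += extra
        slack -= extra
    return sum(p * value[s] for s, p in probs.items())


def lower_bound_values(bounds: Bounds, accepting: Set[State],
                       horizon: int) -> List[Dict[State, float]]:
    """values[k][s]: guaranteed probability of reaching `accepting` within k steps."""
    values = [{s: 1.0 if s in accepting else 0.0 for s in bounds}]
    for _ in range(horizon):
        prev = values[-1]
        values.append({
            s: 1.0 if s in accepting else max(
                worst_case_expectation(succ, prev)
                for succ in bounds[s].values())
            for s in bounds
        })
    return values


def safe_actions(bounds: Bounds, values: List[Dict[State, float]],
                 state: State, steps_left: int, threshold: float) -> List[Action]:
    """Actions whose worst-case satisfaction probability meets the threshold."""
    prev = values[steps_left - 1]
    return [a for a, succ in bounds[state].items()
            if worst_case_expectation(succ, prev) >= threshold]


# Toy model (made-up numbers): one decision state, two absorbing states.
bounds: Bounds = {
    "s0": {
        "safe": {"goal": (0.8, 0.9), "trap": (0.1, 0.2)},
        "risky": {"goal": (0.3, 0.9), "trap": (0.1, 0.7)},
    },
    "goal": {"stay": {"goal": (1.0, 1.0)}},
    "trap": {"stay": {"trap": (1.0, 1.0)}},
}
values = lower_bound_values(bounds, accepting={"goal"}, horizon=1)
allowed = safe_actions(bounds, values, "s0", steps_left=1, threshold=0.75)
print(allowed)
```

In this toy model only the `safe` action is retained at a threshold of 0.75, because the worst case for `risky` puts as much mass as its interval allows on `trap`. A learner restricted to such an action set can explore freely among the retained actions while the lower bound on satisfaction is preserved, which is the spirit of the approach described in the abstract.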

research
02/19/2021

Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning

We present a novel reinforcement learning algorithm for finding optimal ...
research
09/11/2019

Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

Reinforcement Learning (RL) has emerged as an efficient method of choice...
research
10/14/2020

Reinforcement Learning Based Temporal Logic Control with Maximum Probabilistic Satisfaction

This paper presents a model-free reinforcement learning (RL) algorithm t...
research
05/26/2023

Policy Synthesis and Reinforcement Learning for Discounted LTL

The difficulty of manually specifying reward functions has led to an int...
research
10/02/2020

Model-Free Reinforcement Learning for Stochastic Games with Linear Temporal Logic Objectives

We study the problem of synthesizing control strategies for Linear Tempo...
research
12/11/2016

Reinforcement Learning With Temporal Logic Rewards

Reinforcement learning (RL) depends critically on the choice of reward f...
research
01/29/2019

Constraint Satisfaction Propagation: Non-stationary Policy Synthesis for Temporal Logic Planning

Problems arise when using reward functions to capture dependencies betwe...
