Reinforcement Learning with Temporal-Logic-Based Causal Diagrams

06/23/2023
by Yash Paliwal, et al.

We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals. In this setting, a common approach is to represent the tasks as deterministic finite automata (DFA) and integrate them into the state space used by RL algorithms. However, while these machines model the reward function, they often overlook causal knowledge about the environment. To address this limitation, we propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment. We exploit the TL-CD to devise an RL algorithm in which an agent requires significantly less exploration of the environment. To this end, based on a TL-CD and a task DFA, we identify configurations where the agent can determine the expected rewards early during exploration. Through a series of case studies, we demonstrate the benefits of using TL-CDs, particularly the faster convergence of the algorithm to an optimal policy due to reduced exploration of the environment.
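
To make the "common approach" mentioned in the abstract concrete, the following is a minimal sketch of tracking a task DFA alongside the environment state, so that the learner sees the pair (environment state, DFA state). It is not the authors' implementation and does not model the TL-CD itself. The task (observe "key", then "door"), the label names, the `TaskDFA` class, and the helpers `product_step` and `run` are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDFA:
    """A task automaton over high-level labels (hypothetical helper class)."""
    initial: str
    accepting: frozenset
    transitions: dict  # maps (dfa_state, label) -> next dfa_state

    def step(self, state, label):
        # Assumption: labels with no listed transition leave the DFA state unchanged.
        return self.transitions.get((state, label), state)


# Hypothetical task: first observe "key", then observe "door".
dfa = TaskDFA(
    initial="q0",
    accepting=frozenset({"q2"}),
    transitions={("q0", "key"): "q1", ("q1", "door"): "q2"},
)


def product_step(dfa_state, next_env_state, label):
    """Advance the DFA on the observed label and return the product state and reward.

    Reward 1.0 is given once, on the transition that first enters an accepting state.
    """
    next_dfa_state = dfa.step(dfa_state, label)
    entered_accepting = (
        next_dfa_state in dfa.accepting and dfa_state not in dfa.accepting
    )
    reward = 1.0 if entered_accepting else 0.0
    return (next_env_state, next_dfa_state), reward


def run(labels):
    """Replay a sequence of labels; environment states are just step indices here."""
    dfa_state = dfa.initial
    total_reward = 0.0
    for step_index, label in enumerate(labels):
        (_, dfa_state), reward = product_step(dfa_state, step_index + 1, label)
        total_reward += reward
    return dfa_state, total_reward


if __name__ == "__main__":
    print(run(["door", "key", "door"]))
```

In this sketch an RL algorithm would index its value estimates by the product state rather than the environment state alone, which is what lets a Markovian learner handle a temporally extended goal.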


