Eventual Discounting Temporal Logic Counterfactual Experience Replay

03/03/2023
by Cameron Voloshin, et al.

Linear temporal logic (LTL) offers a simplified way of specifying tasks for policy optimization that may otherwise be difficult to describe with scalar reward functions. However, the standard RL framework can be too myopic to find maximally LTL-satisfying policies. This paper makes two contributions. First, we develop a new value-function-based proxy, using a technique we call eventual discounting, under which one can find policies that satisfy the LTL specification with the highest achievable probability. Second, we develop a new experience replay method for generating off-policy data from on-policy rollouts via counterfactual reasoning about different ways of satisfying the LTL specification. Our experiments, conducted in both discrete and continuous state-action spaces, confirm the effectiveness of our counterfactual experience replay approach.
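
The abstract does not spell out how eventual discounting is computed, so the following is only a minimal sketch under an assumed reading: a reward of 1 is collected each time an accepting state of the LTL automaton is visited, and the discount factor is applied only at those visits rather than at every time step. The function names, the boolean `accepting` trace, and the value of `gamma` are hypothetical and chosen for illustration; the full text gives the authors' actual definition.

```python
from typing import Sequence


def standard_discounted_return(accepting: Sequence[bool], gamma: float) -> float:
    """Usual RL return: reward 1 at accepting steps, discounted by gamma every step."""
    return sum(gamma ** t for t, hit in enumerate(accepting) if hit)


def eventual_discounted_return(accepting: Sequence[bool], gamma: float) -> float:
    """Assumed 'eventual' variant: discount only when an accepting step occurs.

    Steps that are not accepting leave the running discount unchanged, so a
    delay before the first accepting visit costs nothing.
    """
    total = 0.0
    discount = 1.0  # gamma raised to the number of accepting visits so far
    for hit in accepting:
        if hit:
            total += discount
            discount *= gamma
    return total


# Two hypothetical traces that both reach an accepting state exactly once,
# one immediately and one after five non-accepting steps.
fast = [True]
slow = [False] * 5 + [True]

fast_std = standard_discounted_return(fast, gamma=0.5)
slow_std = standard_discounted_return(slow, gamma=0.5)
fast_evt = eventual_discounted_return(fast, gamma=0.5)
slow_evt = eventual_discounted_return(slow, gamma=0.5)
```

Under this reading, `fast_evt` and `slow_evt` are equal, while `slow_std` is smaller than `fast_std`. That is one way to see why per-step discounting can be called myopic for LTL tasks: it penalizes a policy for taking longer to satisfy the specification even when it satisfies it just as reliably.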


