Fulfilling Formal Specifications ASAP by Model-free Reinforcement Learning

04/25/2023
by Mengyu Liu, et al.

We propose a model-free reinforcement learning solution, the ASAP-Phi framework, that encourages an agent to fulfill a formal specification as soon as possible (ASAP). The framework uses a piece-wise reward function that assigns a quantitative semantic reward to traces that do not satisfy the specification and a high constant reward to the remaining traces. It then trains an agent with an actor-critic-based algorithm, such as soft actor-critic (SAC) or deep deterministic policy gradient (DDPG). Moreover, we prove that ASAP-Phi produces policies that prioritize fulfilling a specification ASAP. We run extensive experiments, including ablation studies, on state-of-the-art benchmarks. The results show that our framework finds sufficiently fast trajectories in up to 97% of test cases and outperforms the baselines.
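
The piece-wise reward described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the specification (eventually reach a goal within a tolerance on a 1-D trace), the function names `robustness_eventually_reach` and `piecewise_reward`, and the constant `high_reward=100.0` are all assumptions chosen for illustration. The only idea taken from the abstract is the two-branch structure: a quantitative semantic value for traces that do not satisfy the specification, and a high constant reward otherwise.

```python
from typing import Sequence


def robustness_eventually_reach(trace: Sequence[float], goal: float, tol: float) -> float:
    """Quantitative semantics of the hypothetical spec 'eventually |x - goal| <= tol'.

    The value is the best margin achieved anywhere along the trace;
    a value >= 0 means the trace satisfies the spec.
    """
    return max(tol - abs(x - goal) for x in trace)


def piecewise_reward(trace: Sequence[float], goal: float, tol: float,
                     high_reward: float = 100.0) -> float:
    """Piece-wise reward (illustrative assumption, not the paper's exact definition).

    Unsatisfied traces receive their (negative) robustness value;
    satisfied traces receive a high constant reward.
    """
    rho = robustness_eventually_reach(trace, goal, tol)
    if rho >= 0.0:
        return high_reward
    return rho
```

In this sketch, a trace that never comes within `tol` of `goal` earns a negative reward that grows toward zero as it gets closer, while any satisfying trace earns the same constant, so an agent collecting the constant over more steps would be favored for satisfying the specification earlier.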


research
05/20/2023

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

The average reward criterion is relatively less studied as most existing...
research
12/11/2020

OPAC: Opportunistic Actor-Critic

Actor-critic methods, a type of model-free reinforcement learning (RL), ...
research
10/04/2020

FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning

In this paper, we propose a new type of Actor, named forward-looking Act...
research
08/05/2021

Deep Reinforcement Learning for Continuous Docking Control of Autonomous Underwater Vehicles: A Benchmarking Study

Docking control of an autonomous underwater vehicle (AUV) is a task that...
research
03/24/2022

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

Driving 3D characters to dance following a piece of music is highly chal...
research
05/30/2022

Critic Sequential Monte Carlo

We introduce CriticSMC, a new algorithm for planning as inference built ...
research
11/29/2019

Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation

In this paper, we describe NeurIPS 2019 Learning to Move - Walk Around c...
