Co-learning Planning and Control Policies Using Differentiable Formal Task Constraints

03/02/2023
by Zikang Xiong, et al.

This paper presents a hierarchical reinforcement learning algorithm constrained by differentiable signal temporal logic. Previous work on logic-constrained reinforcement learning encodes these constraints in a reward function and constrains policy updates with a sample-based policy gradient. Such techniques are often inefficient because accurate policy gradients require a large number of samples. Instead of implicitly constraining policy search with sample-based policy gradients, we directly constrain policy search by backpropagating through formal constraints, which enables training hierarchical policies with substantially fewer training samples. Hierarchical policies are recognized as a crucial component of reinforcement learning with task constraints. We show that we can stably constrain policy updates, so that different levels of the policy can be learned simultaneously, yielding superior performance compared with training them separately. Experimental results on several simulated high-dimensional robot dynamics and a real-world differential drive robot (TurtleBot3) demonstrate the effectiveness of our approach on five different types of task constraints. Demo videos, code, and models can be found at our project website: https://sites.google.com/view/dscrl
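To make "backpropagating through formal constraints" concrete, here is a minimal sketch and not the paper's implementation. It assumes a single hypothetical specification, "eventually come within `radius` of `goal`", replaces the hard max in its robustness score with a log-sum-exp approximation, and writes the gradient by hand with NumPy. For brevity the gradient updates trajectory waypoints directly; in the setting the abstract describes, it would instead flow into policy parameters. All function names and the `beta`, `lr`, and `steps` values are illustrative assumptions.

```python
import numpy as np

def smooth_max(values, beta=10.0):
    """Log-sum-exp approximation of max; it slightly overestimates the true max."""
    m = values.max()
    return m + np.log(np.exp(beta * (values - m)).sum()) / beta

def eventually_reach(traj, goal, radius, beta=10.0):
    """Smooth robustness of 'eventually within radius of goal' and its gradient.

    traj has shape (T, 2). The returned gradient has the same shape.
    """
    diff = traj - goal
    dist = np.linalg.norm(diff, axis=1)
    margins = radius - dist                    # positive when a waypoint is inside the region
    rho = smooth_max(margins, beta)
    # The derivative of log-sum-exp w.r.t. each margin is its softmax weight.
    weights = np.exp(beta * (margins - margins.max()))
    weights /= weights.sum()
    # Chain rule: d(margin_t)/d(x_t) = -(x_t - goal) / ||x_t - goal||.
    grad = -weights[:, None] * diff / np.maximum(dist, 1e-8)[:, None]
    return rho, grad

def optimize(traj, goal, radius, lr=0.1, steps=200):
    """Gradient ascent on the smooth robustness (hypothetical stand-in for a policy update)."""
    traj = traj.copy()
    for _ in range(steps):
        _, grad = eventually_reach(traj, goal, radius)
        traj += lr * grad
    return traj

if __name__ == "__main__":
    start = np.stack([np.linspace(0.0, 1.0, 5), np.zeros(5)], axis=1)
    goal = np.array([3.0, 3.0])
    before, _ = eventually_reach(start, goal, 0.5)
    after, _ = eventually_reach(optimize(start, goal, 0.5), goal, 0.5)
    print("robustness before:", before, "after:", after)
```

Because the smooth score is an upper bound on the exact robustness, a positive value is only an approximate certificate; checking the true minimum distance, or raising `beta`, gives a stricter test.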


research
05/13/2019

Learning Novel Policies For Tasks

In this work, we present a reinforcement learning algorithm that can fin...
research
09/30/2022

Efficiently Learning Small Policies for Locomotion and Manipulation

Neural control of memory-constrained, agile robots requires small, yet h...
research
11/15/2022

Automatic Evaluation of Excavator Operators using Learned Reward Functions

Training novice users to operate an excavator for learning different ski...
research
01/28/2019

Lyapunov-based Safe Policy Optimization for Continuous Control

We study continuous action reinforcement learning problems in which it i...
research
10/11/2020

Safe Reinforcement Learning with Natural Language Constraints

In this paper, we tackle the problem of learning control policies for ta...
research
03/06/2022

Leveraging Reward Gradients For Reinforcement Learning in Differentiable Physics Simulations

In recent years, fully differentiable rigid body physics simulators have...
research
05/07/2020

Cascade Attribute Network: Decomposing Reinforcement Learning Control Policies using Hierarchical Neural Networks

Reinforcement learning methods have been developed to achieve great succ...
