From STL Rulebooks to Rewards

10/06/2021
by Edgar A. Aguilar, et al.

The automatic synthesis of neural-network controllers for autonomous agents through reinforcement learning has to simultaneously optimize many, possibly conflicting, objectives of varying importance. This multi-objective optimization task is reflected in the shape of the reward function, which is most often crafted by hand in an ad hoc manner. In this paper we propose a principled approach to shaping rewards for reinforcement learning from multiple objectives that are given as a partially ordered set of signal temporal logic (STL) rules. To this end, we first equip STL with a novel quantitative semantics that allows individual requirements to be evaluated automatically. We then develop a method for systematically combining the evaluations of multiple requirements into a single reward that takes into account the priorities defined by the partial order. Finally, we evaluate our approach on several case studies, demonstrating its practical applicability.
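To make the two steps in the abstract concrete, the following is a minimal illustrative sketch, not the paper's method. It assumes the standard min/max robustness semantics for STL rather than the novel semantics the authors propose, and it simplifies the partial order to a list of totally ordered priority ranks. The function names and the weighting scheme in `combine_by_rank` are hypothetical.

```python
from typing import List, Sequence


def always_robustness(signal: Sequence[float], threshold: float) -> float:
    """Standard robustness of 'always (x < threshold)': the worst-case margin."""
    return min(threshold - x for x in signal)


def eventually_robustness(signal: Sequence[float], threshold: float) -> float:
    """Standard robustness of 'eventually (x > threshold)': the best-case margin."""
    return max(x - threshold for x in signal)


def combine_by_rank(robustness_by_rank: List[List[float]]) -> int:
    """Hypothetical combination of rule evaluations into one scalar reward.

    Ranks are listed from highest to lowest priority. A rule counts as
    satisfied when its robustness is positive. Each rank is weighted by a
    power of (m + 1), where m is the largest number of rules in any rank,
    so one extra satisfied rule at a higher rank outweighs all lower ranks.
    """
    max_rules = max(len(rank) for rank in robustness_by_rank)
    base = max_rules + 1
    num_ranks = len(robustness_by_rank)
    reward = 0
    for i, rank in enumerate(robustness_by_rank):
        satisfied = sum(1 for rho in rank if rho > 0)
        reward += satisfied * base ** (num_ranks - 1 - i)
    return reward


if __name__ == "__main__":
    speed = [1.0, 2.0, 3.5]      # assumed trace of a speed signal
    progress = [0.0, 0.2, 0.9]   # assumed trace of a progress signal
    safety = always_robustness(speed, 4.0)          # stay below speed 4.0
    target = eventually_robustness(progress, 0.5)   # reach progress above 0.5
    comfort = -0.1                                  # an assumed violated low-priority rule
    print(combine_by_rank([[safety], [target, comfort]]))
```

With this weighting, an episode that satisfies the single high-priority rule always scores higher than one that violates it, regardless of how many lower-priority rules the latter satisfies.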


research
02/21/2022

Inferring Lexicographically-Ordered Rewards from Preferences

Modeling the preferences of agents over a set of alternatives is a princ...
research
12/02/2022

STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning

Deep Reinforcement Learning (DRL) has the potential to be used for synth...
research
04/11/2022

gTLO: A Generalized and Non-linear Multi-Objective Deep Reinforcement Learning Approach

In real-world decision optimization, often multiple competing objectives...
research
06/16/2021

Mungojerrie: Reinforcement Learning of Linear-Time Objectives

Reinforcement learning synthesizes controllers without prior knowledge o...
research
04/15/2022

The Importance of Credo in Multiagent Learning

We propose a model for multi-objective optimization, a credo, for agents...
research
02/03/2020

Effective Diversity in Population-Based Reinforcement Learning

Maintaining a population of solutions has been shown to increase explora...
research
12/19/2019

Extendable NFV-Integrated Control Method Using Reinforcement Learning

Network functions virtualization (NFV) enables telecommunications servic...
