Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning

02/17/2020
by Alberto Maria Metelli, et al.

The choice of a system's control frequency has a significant impact on the ability of reinforcement learning algorithms to learn a high-performing policy. In this paper, we introduce the notion of action persistence, which consists in repeating an action for a fixed number of decision steps and thereby modifies the control frequency. We first analyze how action persistence affects the performance of the optimal policy, and then present a novel algorithm, Persistent Fitted Q-Iteration (PFQI), that extends FQI with the goal of learning the optimal value function at a given persistence. After providing a theoretical study of PFQI and a heuristic approach to identify the optimal persistence, we present an experimental campaign on benchmark domains that shows the advantages of action persistence and demonstrates the effectiveness of our persistence selection method.
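To make the notion of action persistence concrete, the following is a minimal sketch of repeating one action for k base decision steps and collapsing the result into a single lower-frequency transition. It is an illustration only, not the PFQI algorithm from the paper. The names `persistent_step` and `toy_step`, the `(next_state, reward, done)` step signature, and the discounted accumulation of rewards within the persisted window are all assumptions made for this example.

```python
from typing import Callable, Tuple

# Assumed (hypothetical) one-step transition signature:
# (state, action) -> (next_state, reward, done)
StepFn = Callable[[float, float], Tuple[float, float, bool]]


def persistent_step(
    step_fn: StepFn, state: float, action: float, k: int, gamma: float = 0.99
) -> Tuple[float, float, bool]:
    """Repeat `action` for up to `k` base steps (persistence k).

    Returns the final state, the discounted sum of the rewards collected
    while the action was held, and whether the episode terminated.
    """
    if k < 1:
        raise ValueError("persistence k must be at least 1")
    total_reward = 0.0
    discount = 1.0
    done = False
    for _ in range(k):
        state, reward, done = step_fn(state, action)
        total_reward += discount * reward
        discount *= gamma
        if done:
            # Stop early: the episode ended before the window was over.
            break
    return state, total_reward, done


def toy_step(state: float, action: float) -> Tuple[float, float, bool]:
    """Toy 1-D system (an assumption for illustration): the action shifts
    the state, and the reward penalizes distance from the origin."""
    next_state = state + action
    reward = -abs(next_state)
    done = abs(next_state) >= 10.0
    return next_state, reward, done


if __name__ == "__main__":
    # Hold action 1.0 for 3 base steps, starting from state 0.0.
    print(persistent_step(toy_step, 0.0, 1.0, k=3, gamma=0.5))
```

With k = 1 the function reduces to an ordinary single step, so the base control frequency is recovered; larger k lowers the effective control frequency by the same factor.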


