Improving reinforcement learning algorithms: towards optimal learning rate policies

11/06/2019
by Othmane Mounjid, et al.

This paper investigates to what extent we can improve reinforcement learning algorithms. Our study is split into three parts. First, our analysis shows that the classical asymptotic convergence rate O(1/√N) is pessimistic and can be replaced by O((log(N)/N)^β), with 1/2 ≤ β ≤ 1 and N the number of iterations. Second, we propose a dynamic optimal policy for the choice of the learning rate (γ_k)_{k≥0} used in stochastic algorithms. We decompose our policy into two interacting levels: the inner level and the outer level. In the inner level, we present the PASS algorithm (for "PAst Sign Search") which, based on a predefined sequence (γ^o_k)_{k≥0}, constructs a new sequence (γ^i_k)_{k≥0} whose error decreases faster. In the outer level, we propose an optimal methodology for the selection of the predefined sequence (γ^o_k)_{k≥0}. Third, we show empirically that our learning rate selection methodology significantly outperforms standard algorithms used in reinforcement learning (RL) in the following three applications: the estimation of a drift, the optimal placement of limit orders, and the optimal execution of a large number of shares.
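
To make the role of the learning rate sequence concrete, here is a minimal sketch of a stochastic approximation iteration for estimating a constant drift (a mean) from noisy samples. It is not the PASS algorithm, whose details the abstract does not give. As a stand-in, the optional sign rule below is Kesten's classical heuristic: the index into the predefined sequence advances only when two successive updates change sign. The names `estimate_mean`, `base_rate`, and `use_sign_rule` are hypothetical and chosen for illustration only.

```python
import random


def estimate_mean(samples, base_rate, use_sign_rule=False):
    """Iterate x <- x + gamma * (sample - x) over the samples.

    base_rate(k) is a predefined learning rate sequence (an assumption
    of this sketch), evaluated at index k >= 1.
    With use_sign_rule=True, the index k advances only when two successive
    updates have opposite signs (Kesten's rule, used here as a stand-in;
    it is NOT the paper's PASS algorithm).
    """
    x = 0.0
    k = 0               # current index into the predefined sequence
    prev_update = 0.0   # previous update direction (0.0 before the first step)
    for s in samples:
        update = s - x
        # Without the sign rule, always move on to the next learning rate.
        # With it, keep the current rate while updates point the same way.
        if not use_sign_rule or update * prev_update <= 0:
            k += 1
        x += base_rate(k) * update
        prev_update = update
    return x


# Illustrative usage on synthetic data (drift 0.5, unit Gaussian noise).
random.seed(0)
noisy = [0.5 + random.gauss(0.0, 1.0) for _ in range(1000)]
print(estimate_mean(noisy, lambda k: 1.0 / k))
print(estimate_mean(noisy, lambda k: 1.0 / k, use_sign_rule=True))
```

With `base_rate(k) = 1/k` and the sign rule disabled, the iteration reduces to the running sample average. With the rule enabled, the learning rate stays larger while the estimate is still moving in one direction, which is the general idea behind sign-based learning rate adaptation.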


research
07/14/2019

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

We study two time-scale linear stochastic approximation algorithms, whic...
research
12/12/2020

Tutoring Reinforcement Learning via Feedback Control

We introduce a control-tutored reinforcement learning (CTRL) algorithm. ...
research
10/23/2021

Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL

Recent works in Reinforcement Learning (RL) combine model-free (Mf)-RL a...
research
11/16/2022

Addressing the issue of stochastic environments and local decision-making in multi-objective reinforcement learning

Multi-objective reinforcement learning (MORL) is a relatively new field ...
research
06/18/2019

Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning

In real-world applications of reinforcement learning (RL), noise from in...
research
10/18/2019

Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic

This paper proposes SplitSGD, a new stochastic optimization algorithm wi...
research
08/23/2021

Robust Risk-Aware Reinforcement Learning

We present a reinforcement learning (RL) approach for robust optimisatio...
