An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method

06/15/2018
by   Ajin George Joseph, et al.

In this paper, we provide two new stable online algorithms for the problem of prediction in reinforcement learning, i.e., estimating the value function of a model-free Markov reward process using the linear function approximation architecture, with memory and computation costs that scale quadratically in the size of the feature set. The algorithms employ a multi-timescale stochastic approximation variant of the popular cross entropy (CE) optimization method, a model-based search method for finding the global optimum of a real-valued function. We provide a proof of convergence of the algorithms using the ODE method and supplement our theoretical results with experimental comparisons. The algorithms achieve good performance fairly consistently on many RL benchmark problems with regard to computational efficiency, accuracy and stability.
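
To make the idea of CE-based prediction concrete, here is a minimal sketch of the basic batch CE method applied to a linear value-function fit. It is not the multi-timescale stochastic approximation variant described in the abstract. The two-state Markov reward process, its transition matrix, rewards, one-hot features, and every function and parameter name are assumptions made up for illustration. The objective is also computed from the known model for simplicity, whereas the setting of the paper is model-free, and the sampling distribution is a diagonal Gaussian rather than one with a full covariance matrix.

```python
import random

# Hypothetical two-state Markov reward process (illustration only).
P = [[0.9, 0.1], [0.2, 0.8]]     # transition probabilities
R = [1.0, 0.0]                   # expected reward in each state
GAMMA = 0.9                      # discount factor
PHI = [[1.0, 0.0], [0.0, 1.0]]   # one-hot features: V(s) = PHI[s] . theta


def value(theta, s):
    """Linear value estimate for state s."""
    return sum(f * t for f, t in zip(PHI[s], theta))


def objective(theta):
    """Negative squared Bellman error summed over states (0 is the best)."""
    total = 0.0
    for s in range(len(P)):
        target = R[s] + GAMMA * sum(P[s][n] * value(theta, n)
                                    for n in range(len(P)))
        total += (target - value(theta, s)) ** 2
    return -total


def cross_entropy_search(f, dim, iters=40, n_samples=200, elite_frac=0.1,
                         smoothing=0.7, seed=0):
    """Maximize f with a diagonal-Gaussian batch CE search."""
    rng = random.Random(seed)
    mean, std = [0.0] * dim, [10.0] * dim
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(iters):
        # Sample candidates from the current Gaussian model.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(n_samples)]
        # Keep the best-scoring fraction of the candidates.
        elite = sorted(samples, key=f, reverse=True)[:n_elite]
        # Refit the model to the elite set, with smoothing.
        for d in range(dim):
            col = [x[d] for x in elite]
            mu = sum(col) / n_elite
            sigma = (sum((c - mu) ** 2 for c in col) / n_elite) ** 0.5
            mean[d] = smoothing * mu + (1 - smoothing) * mean[d]
            std[d] = smoothing * sigma + (1 - smoothing) * std[d]
    return mean


theta = cross_entropy_search(objective, dim=2)
print(theta)
```

For this toy chain the exact solution of V = R + γPV is approximately (7.57, 4.86), so the returned `theta` can be compared against it. The smoothing factor slows the shrinkage of the sampling distribution, a common guard against premature convergence in batch CE.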

research
01/31/2018

A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees

The cross entropy (CE) method is a model based search method to solve op...
research
03/25/2019

Q-Learning for Continuous Actions with Cross-Entropy Guided Policies

Off-Policy reinforcement learning (RL) is an important class of methods ...
research
10/15/2020

Safe Model-based Reinforcement Learning with Robust Cross-Entropy Method

This paper studies the safe reinforcement learning (RL) problem without ...
research
08/14/2020

Sample-efficient Cross-Entropy Method for Real-time Planning

Trajectory optimizers for model-based reinforcement learning, such as th...
research
07/01/2015

An Empirical Evaluation of True Online TD(λ)

The true online TD(λ) algorithm has recently been proposed (van Seijen a...
research
01/21/2022

Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning

Value-function (VF) approximation is a central problem in Reinforcement ...
research
06/03/2011

Efficient Reinforcement Learning Using Recursive Least-Squares Methods

The recursive least-squares (RLS) algorithm is one of the most well-know...
