Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning

01/21/2022
by   Sergio Rozada, et al.
0

Value-function (VF) approximation is a central problem in Reinforcement Learning (RL). Classical non-parametric VF estimation suffers from the curse of dimensionality. As a result, parsimonious parametric models have been adopted to approximate VFs in high-dimensional spaces, with most efforts being focused on linear and neural-network-based approaches. Differently, this paper puts forth a a parsimonious non-parametric approach, where we use stochastic low-rank algorithms to estimate the VF matrix in an online and model-free fashion. Furthermore, as VFs tend to be multi-dimensional, we propose replacing the classical VF matrix representation with a tensor (multi-way array) representation and, then, use the PARAFAC decomposition to design an online model-free tensor low-rank algorithm. Different versions of the algorithms are proposed, their complexity is analyzed, and their performance is assessed numerically using standardized RL environments.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset