On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning

01/30/2021
by Alain Durmus, et al.

This paper studies the exponential stability of random matrix products driven by a Markov chain on a general (possibly unbounded) state space. Such stability results are a cornerstone in the analysis of stochastic algorithms in machine learning (e.g. for parameter tracking in online learning or reinforcement learning). Existing results impose strong conditions, such as uniform boundedness of the matrix-valued functions and uniform ergodicity of the Markov chain. Our main contribution is an exponential stability result for the p-th moment of random matrix products, provided that (i) the underlying Markov chain satisfies a super-Lyapunov drift condition, and (ii) the growth of the matrix-valued functions is controlled by an appropriately defined function (related to the drift condition). Using this result, we give finite-time p-th moment bounds for constant and decreasing stepsize linear stochastic approximation schemes with Markovian noise on a general state space. We illustrate these findings for linear value-function estimation in reinforcement learning. In particular, we provide finite-time p-th moment bounds for various members of the temporal difference (TD) family of algorithms.
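
To make the object of study concrete, the following is a minimal simulation sketch, not the paper's construction. It assumes the product takes the form (I - αA(Z_n)) ··· (I - αA(Z_1)) for a constant stepsize α, which is a common convention for linear stochastic approximation but is not stated in the abstract. The two-state chain, the matrices in `A`, the switching probability `P_SWITCH`, and the function names are all hypothetical choices made for illustration. The chain here is finite and the matrices bounded, so the example does not exercise the unbounded setting the paper targets; it only shows a product contracting even though neither A(z) is stable on its own.

```python
import math
import random

# Hypothetical example matrices: each A[z] has a negative eigenvalue, so a
# single factor I - alpha * A[z] expands one coordinate. Their average under
# the chain's stationary law (1/2, 1/2) is 0.75 * I, which is stable.
A = {
    0: [[2.0, 0.0], [0.0, -0.5]],
    1: [[-0.5, 0.0], [0.0, 2.0]],
}
P_SWITCH = 0.3  # assumed probability of leaving the current state at each step


def matmul(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def frobenius(X):
    """Frobenius norm of a matrix stored as nested lists."""
    return math.sqrt(sum(x * x for row in X for x in row))


def product_norms(alpha, n_steps, seed=0):
    """Return the norms of the running product of (I - alpha * A[Z_k])."""
    rng = random.Random(seed)
    z = 0  # initial state of the Markov chain
    prod = [[1.0, 0.0], [0.0, 1.0]]  # start from the identity
    norms = []
    for _ in range(n_steps):
        # Symmetric two-state chain: switch state with probability P_SWITCH.
        if rng.random() < P_SWITCH:
            z = 1 - z
        step = [[(1.0 if i == j else 0.0) - alpha * A[z][i][j]
                 for j in range(2)] for i in range(2)]
        prod = matmul(step, prod)  # newest factor multiplies on the left
        norms.append(frobenius(prod))
    return norms


if __name__ == "__main__":
    norms = product_norms(alpha=0.1, n_steps=2000, seed=0)
    for n in (1, 10, 100, 1000, 2000):
        print(n, norms[n - 1])
```

The norm may grow over short stretches while the chain stays in one state, but it decays over long horizons because the chain mixes between the two states. Averaging the p-th power of such norms over many seeds would give a rough empirical counterpart to the p-th moment that the stability result concerns.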

research
10/27/2021

The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning

The paper concerns convergence and asymptotic statistics for stochastic ...
research
09/10/2019

A Multistep Lyapunov Approach for Finite-Time Analysis of Biased Stochastic Approximation

Motivated by the widespread use of temporal-difference (TD-) and Q-learn...
research
05/27/2019

Finite-Time Analysis of Q-Learning with Linear Function Approximation

In this paper, we consider the model-free reinforcement learning problem...
research
06/08/2020

Stable Reinforcement Learning with Unbounded State Space

We consider the problem of reinforcement learning (RL) with unbounded st...
research
07/05/2023

Stability of Q-Learning Through Design and Optimism

Q-learning has become an important part of the reinforcement learning to...
research
06/28/2023

On the surprising effectiveness of a simple matrix exponential derivative approximation, with application to global SARS-CoV-2

The continuous-time Markov chain (CTMC) is the mathematical workhorse of...
research
10/06/2020

Reinforcement Learning in Deep Structured Teams: Initial Results with Finite and Infinite Valued Features

In this paper, we consider Markov chain and linear quadratic models for ...
