Concentration of Contractive Stochastic Approximation and Reinforcement Learning

06/27/2021
by Siddharth Chandak, et al.

Using a martingale concentration inequality, concentration bounds "from time n_0 on" are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noises. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
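The abstract does not spell out the recursion it studies. As a point of reference only, the sketch below shows the standard contractive stochastic approximation update x_{n+1} = x_n + a_n (F(x_n) - x_n + M_{n+1}), where F is a contraction and M_{n+1} is martingale-difference noise; asynchronous Q-learning and TD(0) are the usual examples of this template, with F an (expected) Bellman operator. All names, step sizes, and the noise model here are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Minimal sketch (assumed standard form, not the paper's exact setup):
#   x_{n+1} = x_n + a_n * (F(x_n) - x_n + M_{n+1})
# with a contraction F and zero-mean (martingale-difference) noise M_{n+1}.

def contractive_sa(F, x0, n_steps, noise_std=0.1, seed=0):
    """Run the basic SA recursion with i.i.d. zero-mean Gaussian noise
    and step sizes a_n = 1/(n+1)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)
        noise = noise_std * rng.standard_normal(x.shape)
        x = x + a_n * (F(x) - x + noise)
    return x

# Example: F(x) = 0.5 * x is a contraction with unique fixed point 0,
# so after an initial phase the iterates should concentrate near 0,
# which is the kind of "from time n_0 on" behavior the bounds describe.
if __name__ == "__main__":
    x_final = contractive_sa(lambda x: 0.5 * x, x0=np.ones(3), n_steps=10_000)
    print(x_final)
```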


