A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

12/10/2019
by Pan Xu, et al.

Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning remains virtually unknown. In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with an O(1/√T) convergence rate if the neural function approximator is sufficiently overparameterized, where T is the number of iterations. To the best of our knowledge, our result is the first finite-time analysis of neural Q-learning under a non-i.i.d. data assumption.
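To make the setting concrete, here is a minimal sketch of a semi-gradient Q-learning update with a ReLU network, run on a single trajectory of a small random MDP so that consecutive samples are not i.i.d. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the two-layer network, the one-hot state-action features, the uniform behaviour policy, the constant step size 1/√T, and every function name below are hypothetical choices made for this example.

```python
import numpy as np

def init_params(rng, in_dim, width):
    # Assumed two-layer ReLU network: Q(x) = v^T relu(W x) / sqrt(width).
    W = rng.normal(size=(width, in_dim))
    v = rng.choice([-1.0, 1.0], size=width)  # output layer kept fixed
    return W, v

def q_value(W, v, x):
    hidden = np.maximum(W @ x, 0.0)
    return float(v @ hidden) / np.sqrt(len(v))

def q_grad_W(W, v, x):
    # Gradient of q_value with respect to W (zero for inactive units).
    active = (W @ x > 0.0).astype(float)
    return (v * active)[:, None] * x[None, :] / np.sqrt(len(v))

def features(s, a, n_states, n_actions):
    # One-hot encoding of the state-action pair (illustrative choice).
    x = np.zeros(n_states * n_actions)
    x[s * n_actions + a] = 1.0
    return x

def neural_q_learning(n_states=4, n_actions=2, width=64, T=500,
                      gamma=0.9, seed=0):
    rng = np.random.default_rng(seed)
    # Random MDP: P[s, a] is a distribution over next states.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(size=(n_states, n_actions))
    W, v = init_params(rng, n_states * n_actions, width)
    eta = 1.0 / np.sqrt(T)  # illustrative constant step size
    s = 0
    for _ in range(T):
        a = int(rng.integers(n_actions))  # uniform behaviour policy
        s_next = int(rng.choice(n_states, p=P[s, a]))
        x = features(s, a, n_states, n_actions)
        next_q = max(
            q_value(W, v, features(s_next, b, n_states, n_actions))
            for b in range(n_actions)
        )
        # Semi-gradient step: the bootstrapped target is treated as constant.
        td_error = q_value(W, v, x) - (R[s, a] + gamma * next_q)
        W = W - eta * td_error * q_grad_W(W, v, x)
        s = s_next  # continue along the same Markov trajectory
    return W, v
```

Calling `neural_q_learning()` returns the trained hidden-layer weights and the fixed output weights. Increasing `width` moves the sketch toward the overparameterized regime the abstract refers to.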

research
10/21/2022

On the connection between Bregman divergence and value in regularized Markov decision processes

In this short note we derive a relationship between the Bregman divergen...
research
04/20/2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations

Despite the success of reinforcement learning (RL) for Markov decision p...
research
04/15/2021

An L^2 Analysis of Reinforcement Learning in High Dimensions with Kernel and Neural Network Approximation

Reinforcement learning (RL) algorithms based on high-dimensional functio...
research
09/29/2020

Finite-Time Analysis for Double Q-learning

Although Q-learning is one of the most successful algorithms for finding...
research
11/20/2019

A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound

Policy evaluation in reinforcement learning is often conducted using two...
research
02/14/2019

On Reinforcement Learning Using Monte Carlo Tree Search with Supervised Learning: Non-Asymptotic Analysis

Inspired by the success of AlphaGo Zero (AGZ) which utilizes Monte Carlo...
research
01/01/2019

A Theoretical Analysis of Deep Q-Learning

Despite the great empirical success of deep reinforcement learning, its ...
