On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

by Mudit Gaur, et al.

Deep Q-learning based algorithms have been applied successfully to many decision-making problems, but their theoretical foundations are less well understood. In this paper, we study Fitted Q-Iteration with a two-layer ReLU neural network parametrization and establish sample complexity guarantees for the algorithm. The approach estimates the Q-function in each iteration by solving a convex optimization problem. We show that this approach achieves a sample complexity of $\tilde{\mathcal{O}}(1/\epsilon^2)$, which is order-optimal. This result holds for a countable state space and does not require structural assumptions on the MDP, such as a linear or low-rank structure.
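
For orientation, the generic Fitted Q-Iteration step with a two-layer ReLU network can be written as below. This is a sketch of the standard textbook template, not the paper's exact formulation: the sample count n, discount factor \gamma, feature map x(s, a), width m, and weights (u_j, \alpha_j) are notation assumed here for illustration and do not appear in the abstract. The abstract states only that the per-iteration fit is carried out through a convex optimization problem; that reformulation is not reproduced here.

```latex
% Assumed notation (not from the abstract): n samples (s_i, a_i, r_i, s_i'),
% discount factor \gamma, feature map x(s, a), hidden width m.
\begin{align*}
% Regression targets built from the current estimate Q_k
y_i &= r_i + \gamma \max_{a'} Q_{k}(s_i', a'), \qquad i = 1, \dots, n, \\
% Two-layer ReLU parametrization of the Q-function
Q_{\theta}(s, a) &= \sum_{j=1}^{m} \alpha_j \, \max\bigl(0,\; u_j^{\top} x(s, a)\bigr),
  \qquad \theta = \{(u_j, \alpha_j)\}_{j=1}^{m}, \\
% Least-squares fit that defines the next iterate
Q_{k+1} &= Q_{\theta_{k+1}}, \qquad
  \theta_{k+1} \in \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n}
  \bigl(Q_{\theta}(s_i, a_i) - y_i\bigr)^2 .
\end{align*}
```

In this notation, the tilde in $\tilde{\mathcal{O}}(1/\epsilon^2)$ conventionally hides logarithmic factors.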



∙ 06/18/2023
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization
Actor-critic algorithms have shown remarkable success in solving state-o...

∙ 06/07/2022
Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure
The practicality of reinforcement learning algorithms has been limited d...

∙ 03/02/2021
Sample Complexity and Overparameterization Bounds for Projection-Free Neural TD Learning
We study the dynamics of temporal-difference learning with neural networ...

∙ 12/12/2022
Variance-Reduced Conservative Policy Iteration
We study the sample complexity of reducing reinforcement learning to a s...

∙ 05/27/2022
Optimizing Objective Functions from Trained ReLU Neural Networks via Sampling
This paper introduces scalable, sampling-based algorithms that optimize ...

∙ 06/13/2019
Variance Estimation For Online Regression via Spectrum Thresholding
We consider the online linear regression problem, where the predictor ve...

∙ 10/14/2020
Polar Deconvolution of Mixed Signals
The signal demixing problem seeks to separate the superposition of multi...
