Variance-reduced Q-learning is minimax optimal

06/11/2019
by   Martin J. Wainwright, et al.

We introduce and analyze a form of variance-reduced Q-learning. For γ-discounted MDPs with finite state space X and action space U, we prove that it yields an ϵ-accurate estimate of the optimal Q-function in the ℓ_∞-norm using O((D / (ϵ^2 (1-γ)^3)) log(D / (1-γ))) samples, where D = |X| × |U|. This guarantee matches known minimax lower bounds up to a logarithmic factor in the discount complexity, and is the first form of model-free Q-learning proven to achieve the worst-case optimal cubic scaling in the discount complexity parameter 1/(1-γ) accompanied by optimal linear scaling in the state and action space sizes. By contrast, our past work shows that ordinary Q-learning has worst-case quartic scaling in the discount complexity.
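To get a feel for the scaling in the abstract, the following is a minimal sketch that evaluates the order of the stated bound, read here as (D / (ϵ^2 (1-γ)^3)) log(D / (1-γ)) with all constants dropped. The function names are hypothetical and the numbers are orders of magnitude only, not sample counts guaranteed by the paper. The second helper reflects only the cubic-versus-quartic comparison made in the abstract, ignoring any logarithmic factors.

```python
import math


def vr_q_learning_sample_order(num_states, num_actions, gamma, eps):
    """Order of the bound (D / (eps^2 (1-gamma)^3)) * log(D / (1-gamma)).

    Constants are omitted (an assumption of this sketch), so the result
    is only meaningful for comparing settings against each other.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    d = num_states * num_actions          # D = |X| x |U|
    complexity = 1.0 / (1.0 - gamma)      # discount complexity 1/(1-gamma)
    return d * complexity**3 / eps**2 * math.log(d * complexity)


def quartic_over_cubic(gamma):
    """Ratio (1-gamma)^-4 / (1-gamma)^-3, i.e. one extra factor of 1/(1-gamma)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return 1.0 / (1.0 - gamma)
```

For example, moving from γ = 0.9 to γ = 0.99 multiplies the discount complexity by 10, so the cubic term grows by a factor of about 1000 while a quartic term would grow by about 10000.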


research
06/28/2021

Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning

Various algorithms in reinforcement learning exhibit dramatic variabilit...
research
06/11/2020

On Worst-case Regret of Linear Thompson Sampling

In this paper, we consider the worst-case regret of Linear Thompson Samp...
research
06/04/2020

Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction

Asynchronous Q-learning aims to learn the optimal action-value function ...
research
02/12/2021

Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis

Q-learning, which seeks to learn the optimal Q-function of a Markov deci...
research
09/24/2021

Optimal policy evaluation using kernel-based temporal difference methods

We study methods based on reproducing kernel Hilbert spaces for estimati...
research
06/19/2020

Minimax rates without the fixed sample size assumption

We generalize the notion of minimax convergence rate. In contrast to the...
research
03/18/2021

Comparative Design-Choice Analysis of Color Refinement Algorithms Beyond the Worst Case

Color refinement is a crucial subroutine in symmetry detection in theory...
