Is Q-Learning Provably Efficient? An Extended Analysis

09/22/2020
by Kushagra Rastogi, et al.

This work extends the analysis of the theoretical results presented in the paper Is Q-Learning Provably Efficient? by Jin et al. We include a survey of related research to contextualize the need for stronger theoretical guarantees along one of the most important threads of model-free reinforcement learning. We also expound upon the reasoning used in the proofs, highlighting the critical steps leading to the main result: Q-learning with UCB exploration achieves a sample efficiency that matches the optimal regret achievable by any model-based approach.
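
The result in question concerns the UCB-Hoeffding variant of Q-learning analyzed by Jin et al. for tabular episodic MDPs. The following is a minimal sketch of that algorithm, written in Python for illustration only; the environment interface (env.reset and env.step), the bonus constant c, and the failure probability delta are assumptions made here for readability, not code from the paper.

    import numpy as np

    def q_learning_ucb_hoeffding(env, S, A, H, K, c=1.0, delta=0.1):
        # Sketch of tabular Q-learning with a UCB-Hoeffding bonus, after Jin et al.
        # S, A: number of states/actions; H: horizon; K: number of episodes.
        # env is assumed to provide reset() -> s and step(h, s, a) -> (r, s_next).
        T = K * H
        iota = np.log(S * A * T / delta)          # log factor inside the bonus
        Q = np.full((H, S, A), float(H))          # optimistic initialization at H
        V = np.zeros((H + 1, S))
        V[:H, :] = float(H)                       # V at step H+1 is zero; earlier steps start at H
        N = np.zeros((H, S, A), dtype=int)        # visit counts per (step, state, action)

        for _ in range(K):
            s = env.reset()
            for h in range(H):
                a = int(np.argmax(Q[h, s]))                   # act greedily on the optimistic Q
                r, s_next = env.step(h, s, a)
                N[h, s, a] += 1
                t = N[h, s, a]
                alpha = (H + 1) / (H + t)                     # learning rate used in the analysis
                bonus = c * np.sqrt(H ** 3 * iota / t)        # Hoeffding-style exploration bonus
                target = r + V[h + 1, s_next] + bonus
                Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * target
                V[h, s] = min(float(H), Q[h, s].max())        # value estimates are clipped at H
                s = s_next
        return Q, V

The two ingredients the proofs lean on are the learning rate alpha_t = (H + 1) / (H + t) and the bonus proportional to sqrt(H^3 * iota / t), which together keep the Q-value estimates optimistic with high probability while the visit counts grow.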


Related research

Is Q-learning Provably Efficient? (07/10/2018)
Model-free reinforcement learning (RL) algorithms, such as Q-learning, d...

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition (04/21/2020)
We study the reinforcement learning problem in the setting of finite-hor...

Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP (01/27/2019)
A fundamental question in reinforcement learning is whether model-free a...

Minimax Regret Bounds for Reinforcement Learning (03/16/2017)
We consider the problem of provably optimal exploration in reinforcement...

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL (09/20/2022)
The cooperative Multi-Agent Reinforcement Learning (MARL) with permuta...

The Effect of Q-function Reuse on the Total Regret of Tabular, Model-Free, Reinforcement Learning (03/07/2021)
Some reinforcement learning methods suffer from high sample complexity c...

Efficient exploration via epistemic-risk-seeking policy optimization (02/18/2023)
Exploration remains a key challenge in deep reinforcement learning (RL)....
