Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning

11/01/2021
by   Yuanzhi Li, et al.

Recently there has been a surge of interest in understanding the horizon-dependence of the sample complexity in reinforcement learning (RL). Notably, for an RL environment with horizon length H, previous work has shown that there is a probably approximately correct (PAC) algorithm that learns an O(1)-optimal policy using polylog(H) episodes of environment interactions when the number of states and actions is fixed. It is not yet known whether the polylog(H) dependence is necessary. In this work, we resolve this question by developing an algorithm that achieves the same PAC guarantee while using only O(1) episodes of environment interactions, completely settling the horizon-dependence of the sample complexity in RL. We achieve this bound by (i) establishing a connection between value functions in discounted and finite-horizon Markov decision processes (MDPs) and (ii) developing a novel perturbation analysis in MDPs. We believe our new techniques are of independent interest and could be applied to related questions in RL.
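As a rough illustration of what it means to relate discounted and finite-horizon value functions, the sketch below computes both on a small random tabular MDP and prints them on a common per-step scale. It is not the paper's algorithm or analysis. The toy MDP, the helper names (`make_random_mdp`, `backup`, `finite_horizon_value`, `discounted_value`), and the pairing of horizon H with discount factor 1 - 1/H are all assumptions made for this example.

```python
import random


def make_random_mdp(num_states, num_actions, seed=0):
    """Build a hypothetical toy MDP: P[s][a] is a distribution over next states,
    R[s][a] is a reward in [0, 1)."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(num_states):
        p_row, r_row = [], []
        for _ in range(num_actions):
            weights = [rng.random() for _ in range(num_states)]
            total = sum(weights)
            p_row.append([w / total for w in weights])
            r_row.append(rng.random())
        P.append(p_row)
        R.append(r_row)
    return P, R


def backup(P, R, V, gamma=1.0):
    """One Bellman optimality backup: max over actions of reward plus
    (discounted) expected next-state value."""
    return [
        max(
            R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
            for a in range(len(R[s]))
        )
        for s in range(len(R))
    ]


def finite_horizon_value(P, R, H):
    """Optimal undiscounted value with H steps remaining, by backward induction."""
    V = [0.0] * len(R)
    for _ in range(H):
        V = backup(P, R, V)
    return V


def discounted_value(P, R, gamma, tol=1e-10):
    """Optimal discounted value via value iteration, stopped when successive
    iterates differ by less than tol in the max norm."""
    V = [0.0] * len(R)
    while True:
        V_new = backup(P, R, V, gamma)
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new


if __name__ == "__main__":
    H = 50
    gamma = 1.0 - 1.0 / H  # assumed pairing of horizon and discount factor
    P, R = make_random_mdp(num_states=3, num_actions=2)
    V_fin = finite_horizon_value(P, R, H)
    V_disc = discounted_value(P, R, gamma)
    for s in range(len(R)):
        # Rescale both values to an average-reward-per-step scale for comparison.
        print(s, V_fin[s] / H, (1.0 - gamma) * V_disc[s])
```

Dividing the finite-horizon value by H and multiplying the discounted value by 1 - gamma puts both on the same scale, which makes the two quantities directly comparable for a given state.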


research
10/29/2015

Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning

Recently, there has been significant progress in understanding reinforce...
research
02/15/2023

Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes

We consider the optimal sample complexity theory of tabular reinforcemen...
research
09/05/2020

A Hybrid PAC Reinforcement Learning Algorithm

This paper offers a new hybrid probably asymptotically correct (PAC) rei...
research
09/23/2020

A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints

Constrained Markov Decision Processes (CMDPs) formalize sequential decis...
research
11/24/2021

Reinforcement Learning for General LTL Objectives Is Intractable

In recent years, researchers have made significant progress in devising ...
research
05/01/2020

Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?

Learning to plan for long horizons is a central challenge in episodic re...
research
04/19/2023

Bridging RL Theory and Practice with the Effective Horizon

Deep reinforcement learning (RL) works impressively in some environments...
