Provably Efficient Reinforcement Learning with General Value Function Approximation

05/21/2020
by   Ruosong Wang, et al.

Value function approximation has demonstrated phenomenal empirical success in reinforcement learning (RL). Nevertheless, despite recent progress on developing theory for RL with linear function approximation, an understanding of general function approximation schemes is still largely missing. In this paper, we establish the first provably efficient RL algorithm with general value function approximation. In particular, we show that if the value functions admit an approximation with a function class F, our algorithm achieves a regret bound of O(poly(dH)√T), where d is a complexity measure of F, H is the planning horizon, and T is the number of interactions with the environment. Our theory strictly generalizes recent progress on RL with linear function approximation and does not make explicit assumptions on the model of the environment. Moreover, our algorithm is model-free and provides a framework to justify algorithms used in practice.
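
For readers unfamiliar with the quantity being bounded, the following is a sketch of the usual definition of regret in an episodic setting. The abstract does not spell this out, so the symbols K (number of episodes), π_k (the policy played in episode k), s_1^k (the initial state of episode k), and V_1 (the value function at the first step) are assumptions introduced here for illustration.

```latex
% Assumed standard episodic setting: K episodes of H steps each, so T = KH.
% V_1^{*} is the optimal value at step 1; V_1^{\pi_k} is the value of the
% policy \pi_k that the algorithm plays in episode k.
\[
  \mathrm{Reg}(K) \;=\; \sum_{k=1}^{K} \Bigl( V_1^{*}(s_1^{k}) - V_1^{\pi_k}(s_1^{k}) \Bigr),
  \qquad T = KH .
\]
% The stated bound then reads:
\[
  \mathrm{Reg}(K) \;=\; O\bigl(\mathrm{poly}(dH)\,\sqrt{T}\bigr).
\]
```

Under this reading, a bound that grows as √T means the average per-interaction regret, Reg(K)/T, shrinks on the order of 1/√T as T grows.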


research
02/22/2023

Provably Efficient Reinforcement Learning via Surprise Bound

Value function approximation is important in modern reinforcement learni...
research
05/01/2019

Information-Theoretic Considerations in Batch Reinforcement Learning

Value-function approximation methods that operate in batch mode have fou...
research
08/28/2019

Reinforcement Learning: Prediction, Control and Value Function Approximation

With the increasing power of computers and the rapid development of self...
research
10/30/2010

Predictive State Temporal Difference Learning

We propose a new approach to value function approximation which combines...
research
07/18/2013

Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization

We consider the problem of reinforcement learning over episodes of a fin...
research
03/25/2021

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

This paper considers batch Reinforcement Learning (RL) with general valu...
research
10/28/2020

Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning

Despite empirical success, the theory of reinforcement learning (RL) wit...
