Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

03/25/2021
by Yaqi Duan, et al.

This paper considers batch Reinforcement Learning (RL) with general value function approximation. We investigate the minimal assumptions needed to reliably estimate or minimize the Bellman error, and characterize generalization performance via (local) Rademacher complexities of general function classes, taking initial steps toward bridging the gap between statistical learning theory and batch RL. Concretely, we view the Bellman error as a surrogate loss for the optimality gap and prove the following: (1) In the double sampling regime, the excess risk of the Empirical Risk Minimizer (ERM) is bounded by the Rademacher complexity of the function class. (2) In the single sampling regime, sample-efficient risk minimization is not possible without further assumptions, regardless of the algorithm. However, under completeness assumptions, the excess risk of Fitted Q-Iteration (FQI) and of a minimax-style algorithm can again be bounded by the Rademacher complexity of the corresponding function classes. (3) Fast statistical rates can be achieved using the tools of local Rademacher complexity. Our analysis covers a wide range of function classes, including finite classes, linear spaces, kernel spaces, and sparse linear features.
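The distinction between the double and single sampling regimes can be illustrated with a small numerical sketch. It is not taken from the paper: the function name `bellman_error_estimates`, the toy two-outcome transition, and all numbers are assumptions chosen for illustration. For a fixed state-action pair, the squared temporal-difference error built from one sampled next state overestimates the squared Bellman error by the (discounted) variance of the next-state value, whereas the product of two temporal-difference errors built from two independent next-state draws is unbiased.

```python
import numpy as np

def bellman_error_estimates(q_sa, reward, gamma, next_values, next_probs,
                            n_samples, seed=0):
    """Compare single- and double-sample estimates of the squared Bellman
    error at one fixed (s, a). All inputs are hypothetical toy quantities:
    q_sa is f(s, a), next_values[i] is max_a' f(s'_i, a'), and next_probs[i]
    is the probability of transitioning to s'_i."""
    rng = np.random.default_rng(seed)
    next_values = np.asarray(next_values, dtype=float)
    next_probs = np.asarray(next_probs, dtype=float)

    # Two independent draws of the next state from the same (s, a).
    idx1 = rng.choice(len(next_values), size=n_samples, p=next_probs)
    idx2 = rng.choice(len(next_values), size=n_samples, p=next_probs)
    td1 = q_sa - reward - gamma * next_values[idx1]
    td2 = q_sa - reward - gamma * next_values[idx2]

    # Single sampling: square one TD error (biased upward by the variance).
    single = float(np.mean(td1 ** 2))
    # Double sampling: multiply two independent TD errors (unbiased).
    double = float(np.mean(td1 * td2))
    # Exact squared Bellman error, computed from the known transition law.
    exact = float((q_sa - reward - gamma * next_probs @ next_values) ** 2)
    return exact, single, double

exact, single, double = bellman_error_estimates(
    q_sa=1.0, reward=0.0, gamma=0.9,
    next_values=[0.0, 2.0], next_probs=[0.5, 0.5], n_samples=200_000)
print(f"exact={exact:.4f}  single={single:.4f}  double={double:.4f}")
```

In this toy setting the exact squared Bellman error is 0.01, while the single-sample estimate concentrates near 0.01 + 0.9² · Var = 0.82, so minimizing it would penalize functions for transition noise rather than for Bellman inconsistency.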


research
05/21/2020

Provably Efficient Reinforcement Learning with General Value Function Approximation

Value function approximation has demonstrated phenomenal empirical succe...
research
05/01/2019

Information-Theoretic Considerations in Batch Reinforcement Learning

Value-function approximation methods that operate in batch mode have fou...
research
02/22/2023

Provably Efficient Reinforcement Learning via Surprise Bound

Value function approximation is important in modern reinforcement learni...
research
11/09/2020

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces

The classical theory of reinforcement learning (RL) has focused on tabul...
research
07/06/2023

Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation

Risk-sensitive reinforcement learning (RL) aims to optimize policies tha...
research
05/12/2014

Structural Return Maximization for Reinforcement Learning

Batch Reinforcement Learning (RL) algorithms attempt to choose a policy ...
research
03/09/2020

Q^* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison

We prove performance guarantees of two algorithms for approximating Q^* i...
