Sharp Analysis of Smoothed Bellman Error Embedding

07/07/2020
by   Ahmed Touati, et al.

The Smoothed Bellman Error Embedding algorithm <cit.>, known as SBEED, was proposed as a provably convergent reinforcement learning algorithm with general nonlinear function approximation. It has been successfully implemented with neural networks and has achieved strong empirical results. In this work, we study the theoretical behavior of SBEED in batch-mode reinforcement learning. We prove a near-optimal performance guarantee that depends on the representation power of the function classes used and on a tight notion of distribution shift. Our results improve upon the prior guarantees for SBEED in <cit.> in terms of the dependence on the planning horizon and on the sample size. Our analysis builds on the recent work of <cit.>, which studies a related algorithm, MSBO, that can be interpreted as a non-smooth counterpart of SBEED.


