Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

10/29/2016
by   Nan Jiang, et al.

This paper studies systematic exploration for reinforcement learning with rich observations and function approximation. We introduce a new model, called contextual decision processes, that unifies and generalizes most prior settings. Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings. Our second contribution is a new reinforcement learning algorithm that engages in systematic exploration to learn contextual decision processes with low Bellman rank. Our algorithm provably learns near-optimal behavior with a number of samples that is polynomial in all relevant parameters but independent of the number of unique observations. The approach uses Bellman error minimization with optimistic exploration and provides new insights into efficient exploration for reinforcement learning with function approximation.

research
02/08/2016

PAC Reinforcement Learning with Rich Observations

We propose and study a new model for reinforcement learning with rich ob...
research
05/14/2021

Efficient PAC Reinforcement Learning in Regular Decision Processes

Recently regular decision processes have been proposed as a well-behaved...
research
03/15/2020

Provably Efficient Exploration for RL with Unsupervised Learning

We study how to use unsupervised learning for efficient exploration in r...
research
07/16/2018

Shielded Decision-Making in MDPs

A prominent problem in artificial intelligence and machine learning is t...
research
11/01/2019

Explicit Explore-Exploit Algorithms in Continuous State Spaces

We present a new model-based algorithm for reinforcement learning (RL) w...
research
11/21/2018

Model-Based Reinforcement Learning in Contextual Decision Processes

We study the sample complexity of model-based reinforcement learning in ...
research
07/07/2020

Sharp Analysis of Smoothed Bellman Error Embedding

The Smoothed Bellman Error Embedding algorithm, known as SBEED, w...
