A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

09/30/2022
by Zixiang Chen, et al.

With the increasing need to handle large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL). In this paper, we propose a general framework that unifies model-based and model-free RL, together with an Admissible Bellman Characterization (ABC) class that subsumes nearly all Markov Decision Process (MDP) models in the literature for tractable RL. We propose a novel estimation function with decomposable structural properties for optimization-based exploration, and we introduce the functional eluder dimension as a complexity measure of the ABC class. Under our framework, we propose a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), which achieves regret bounds that match or improve upon the best-known results for a variety of MDP models. In particular, for MDPs with low Witness rank, under a slightly stronger assumption, OPERA improves the state-of-the-art sample complexity results by a factor of dH. Our framework provides a generic interface for designing and analyzing new RL models and algorithms.

research
11/21/2018

Model-Based Reinforcement Learning in Contextual Decision Processes

We study the sample complexity of model-based reinforcement learning in ...
research
12/06/2019

Observational Overfitting in Reinforcement Learning

A major component of overfitting in model-free reinforcement learning (R...
research
03/19/2021

Bilinear Classes: A Structural Framework for Provable Generalization in RL

This work introduces Bilinear Classes, a new structural framework, which...
research
05/29/2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

In online reinforcement learning (online RL), balancing exploration and ...
research
10/19/2016

A Reinforcement Learning Approach to the View Planning Problem

We present a Reinforcement Learning (RL) solution to the view planning p...
research
03/06/2021

Causal Reinforcement Learning: An Instrumental Variable Approach

In the standard data analysis framework, data is first collected (once f...
research
02/01/2021

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms

Finding the minimal structural assumptions that empower sample-efficient...
