Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

05/17/2021
by Gen Li, et al.

Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning (RL). The current paper pertains to a scenario with value-based linear representation, which postulates the linear realizability of the optimal Q-function (also called the "linear Q^⋆ problem"). While linear realizability alone does not allow for sample-efficient solutions in general, the presence of a large sub-optimality gap is a potential game changer, depending on the sampling mechanism in use. Informally, sample efficiency is achievable with a large sub-optimality gap when a generative model is available but is unfortunately infeasible when we turn to standard online RL settings. In this paper, we make progress towards understanding this linear Q^⋆ problem by investigating a new sampling protocol, which draws samples in an online/exploratory fashion but allows one to backtrack and revisit previous states in a controlled and infrequent manner. This protocol is more flexible than the standard online RL setting, while being practically relevant and far more restrictive than the generative model. We develop an algorithm tailored to this setting, achieving a sample complexity that scales polynomially with the feature dimension, the horizon, and the inverse sub-optimality gap, but not the size of the state/action space. Our findings underscore the fundamental interplay between sampling protocols and low-complexity structural representation in RL.
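To make the sampling protocol concrete, below is a minimal sketch of what "online sampling with limited revisiting" could look like as an environment interface. This is not the authors' implementation; all names (RevisitingEnv, step, revisit, revisit_budget, and the assumed mdp object with initial_state/sample methods) are hypothetical illustrations of the access model described in the abstract: ordinary online transitions, plus a bounded budget of backtracks to previously encountered states.

```python
# Minimal sketch (assumptions, not a published API) of the limited-revisiting
# sampling protocol: online interaction with a bounded number of backtracks
# to states already visited in the current episode.

class RevisitingEnv:
    """Online MDP access with a bounded number of revisits to past states."""

    def __init__(self, mdp, horizon, revisit_budget):
        # `mdp` is assumed to expose initial_state() and sample(state, action)
        # -> (next_state, reward); these are hypothetical helper methods.
        self.mdp = mdp
        self.horizon = horizon
        self.revisit_budget = revisit_budget  # how "infrequent" revisiting may be
        self.visited = []                     # states seen so far this episode
        self.state = None

    def reset(self):
        """Start a fresh episode from the initial-state distribution."""
        self.state = self.mdp.initial_state()
        self.visited = [self.state]
        return self.state

    def step(self, action):
        """Standard online transition: sample the next state and reward."""
        next_state, reward = self.mdp.sample(self.state, action)
        self.state = next_state
        self.visited.append(next_state)
        return next_state, reward

    def revisit(self, index):
        """Backtrack to a previously visited state, consuming revisit budget."""
        if self.revisit_budget <= 0:
            raise RuntimeError("revisit budget exhausted; only online steps allowed")
        self.revisit_budget -= 1
        self.state = self.visited[index]
        return self.state
```

Under this reading, a learner runs mostly ordinary online episodes and spends its revisit budget only occasionally, e.g. to resample transitions from a state whose value estimate is uncertain; keeping the budget small is what makes the protocol far weaker than a generative model while still stronger than pure online exploration.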


