Is Plug-in Solver Sample-Efficient for Feature-based Reinforcement Learning?

10/12/2020
by Qiwen Cui, et al.

A model-based approach to reinforcement learning (RL) is widely believed to be key to reducing sample complexity. However, the sample optimality of model-based RL is still poorly understood, even in the linear case. This work considers the sample complexity of finding an ϵ-optimal policy in a Markov decision process (MDP) that admits a linear additive feature representation, given only access to a generative model. We solve this problem via a plug-in solver approach, which builds an empirical model and then plans in that model using an arbitrary plug-in solver. We prove that under the anchor-state assumption, which implies implicit non-negativity in the feature space, the minimax sample complexity of finding an ϵ-optimal policy in a γ-discounted MDP is O(K/((1-γ)^3 ϵ^2)), which depends only on the dimensionality K of the feature space and has no dependence on the size of the state or action space. We further extend our results to a relaxed setting where anchor states may not exist and show that a plug-in approach can be sample-efficient there as well, providing a flexible approach to designing model-based algorithms for RL.
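The plug-in recipe described in the abstract (estimate a model from generative-model samples, then hand it to any planner) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's algorithm or notation: the tabular toy setting, the function names, the representation of the anchor-state assumption as each transition row being a convex combination of K anchor rows with known weights `features`, and the use of value iteration as the plug-in solver.

```python
import numpy as np

def sample_anchor_model(P_true, anchors, n_samples, rng):
    """Estimate next-state distributions at K anchor (state, action) pairs.

    P_true stands in for the generative model: we only draw samples from it.
    """
    n_states = P_true.shape[2]
    P_hat = np.zeros((len(anchors), n_states))
    for k, (s, a) in enumerate(anchors):
        next_states = rng.choice(n_states, size=n_samples, p=P_true[s, a])
        P_hat[k] = np.bincount(next_states, minlength=n_states) / n_samples
    return P_hat

def build_empirical_model(features, P_hat_anchor):
    """Assumed linear additive model: P(.|s,a) = sum_k features[s,a,k] * P(.|anchor k)."""
    # (S, A, K) @ (K, S) -> (S, A, S)
    return features @ P_hat_anchor

def value_iteration(P, r, gamma, tol=1e-8, max_iter=10_000):
    """One possible plug-in solver; any planner for the empirical MDP could be swapped in."""
    v = np.zeros(P.shape[0])
    for _ in range(max_iter):
        v_new = (r + gamma * (P @ v)).max(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = r + gamma * (P @ v)
    return q.argmax(axis=1), v

# Toy instance (hypothetical sizes): S states, A actions, K anchor pairs.
rng = np.random.default_rng(0)
S, A, K, gamma = 6, 2, 3, 0.9
anchors = [(0, 0), (1, 1), (2, 0)]
P_anchor = rng.dirichlet(np.ones(S), size=K)          # (K, S)
features = rng.dirichlet(np.ones(K), size=(S, A))     # (S, A, K), non-negative weights
for k, (s, a) in enumerate(anchors):
    features[s, a] = np.eye(K)[k]                     # anchors represent themselves
P_true = features @ P_anchor
r = rng.uniform(size=(S, A))

P_hat_anchor = sample_anchor_model(P_true, anchors, n_samples=2000, rng=rng)
P_hat = build_empirical_model(features, P_hat_anchor)
policy, v_hat = value_iteration(P_hat, r, gamma)
print("plug-in policy:", policy)
```

In this sketch samples are drawn only at the K anchor pairs, so the total sample count scales with K rather than with the number of states and actions, which is the kind of dependence the stated bound describes. The constants and the per-anchor sample size needed for ϵ-optimality are not derived here.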


