Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions

10/03/2020
by Gellért Weisz, et al.

We consider the problem of local planning in fixed-horizon Markov Decision Processes (MDPs) with linear function approximation and a generative model, under the assumption that the optimal action-value function lies in the span of a feature map that is available to the planner. Previous work has left open the question of whether there exist sound planners that need only poly(H, d) queries regardless of the MDP, where H is the horizon and d is the dimensionality of the features. We answer this question in the negative: we show that any sound planner must make at least min(exp(Ω(d)), Ω(2^H)) queries. We also show that for any δ>0, the least-squares value iteration algorithm with O(H^5d^(H+1)/δ^2) queries can compute a δ-optimal policy. We discuss implications and remaining open questions.
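
To get a feel for how the two bounds in the abstract scale, the following minimal sketch evaluates both expressions numerically. It is illustrative only: the function names are hypothetical, and the constants hidden inside Ω(·) and O(·) are assumed to be 1, which the abstract does not state.

```python
import math


def lower_bound_scale(d: int, H: int) -> float:
    """Scale of the lower bound min(exp(d), 2^H).

    Assumption: constants hidden in the Omega(.) terms are set to 1.
    """
    return min(math.exp(d), 2.0 ** H)


def lsvi_query_scale(H: int, d: int, delta: float) -> float:
    """Scale of the upper bound H^5 * d^(H+1) / delta^2.

    Assumption: the constant hidden in O(.) is set to 1.
    """
    return H ** 5 * d ** (H + 1) / delta ** 2


if __name__ == "__main__":
    # Compare both expressions for a few (H, d) pairs at delta = 0.1.
    for H, d in [(5, 10), (10, 10), (20, 10)]:
        lo = lower_bound_scale(d, H)
        hi = lsvi_query_scale(H, d, 0.1)
        print(f"H={H:2d} d={d:2d} lower~{lo:.3g} upper~{hi:.3g}")
```

Under these assumptions both quantities grow exponentially in H once d is fixed, which matches the abstract's point that poly(H, d) query complexity is not achievable in general.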

research
02/03/2021

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

We consider the problem of local planning in fixed-horizon Markov Decisi...
research
10/05/2021

TensorPlan and the Few Actions Lower Bound for Planning in MDPs under Linear Realizability of Optimal Value Functions

We consider the minimax query complexity of online planning with a gener...
research
05/07/2018

Planning and Learning with Stochastic Action Sets

In many practical uses of reinforcement learning (RL) the set of actions...
research
05/22/2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

Mirror descent value iteration (MDVI), an abstraction of Kullback-Leible...
research
10/27/2022

Confident Approximate Policy Iteration for Efficient Local Planning in q^π-realizable MDPs

We consider approximate dynamic programming in γ-discounted Markov decis...
research
10/21/2022

Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization

We propose a new stochastic primal-dual optimization algorithm for plann...
research
09/09/2011

mGPT: A Probabilistic Planner Based on Heuristic Search

We describe the version of the GPT planner used in the probabilistic tra...
