Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL

08/31/2020
by Xiaoyu Chen, et al.

We study reinforcement learning (RL) in episodic, factored Markov decision processes (FMDPs). We propose an algorithm called FMDP-BF, which leverages the factorization structure of the FMDP. The regret of FMDP-BF is shown to be exponentially smaller than that of optimal algorithms designed for non-factored MDPs, and it improves on the best previous result for FMDPs <cit.> by a factor of √(H|𝒮_i|), where |𝒮_i| is the cardinality of the factored state subspace and H is the planning horizon. To show the optimality of our bounds, we also provide a lower bound for FMDPs, which indicates that our algorithm is near-optimal with respect to the number of timesteps T, the horizon H, and the cardinality of the factored state-action subspace. Finally, as an application, we study a new formulation of constrained RL, known as RL with knapsack constraints (RLwK), and provide the first sample-efficient algorithm for it, based on FMDP-BF.
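To give a rough sense of why factorization structure can help, the following minimal sketch compares the number of entries in a flat transition table with the number needed when each state factor depends only on a small set of parent factors. It is an illustration only and is not taken from the paper: the function names, the ten binary factors, the four actions, and the ring-shaped dependency scopes are all assumptions chosen for the example, and the count concerns model size, not the regret bound itself.

```python
from math import prod


def flat_transition_params(factor_sizes, num_actions):
    """Entries in a flat table P(s' | s, a) over the joint state space."""
    num_states = prod(factor_sizes)
    return num_states * num_actions * num_states


def factored_transition_params(factor_sizes, num_actions, scopes):
    """Entries when factor i depends only on the action and the factors in scopes[i]."""
    total = 0
    for i, scope in enumerate(scopes):
        # Number of distinct (parent values, action) configurations for factor i.
        parent_configs = prod(factor_sizes[j] for j in scope) * num_actions
        # One probability per next value of factor i, per parent configuration.
        total += parent_configs * factor_sizes[i]
    return total


# Hypothetical setup: 10 binary factors, 4 actions, and each factor
# depends on itself and its right-hand neighbour on a ring.
factor_sizes = [2] * 10
num_actions = 4
scopes = [[i, (i + 1) % 10] for i in range(10)]

flat = flat_transition_params(factor_sizes, num_actions)
factored = factored_transition_params(factor_sizes, num_actions, scopes)
print(flat, factored)  # 4194304 320
```

In this hypothetical setup the flat table grows with the product of all factor sizes, while the factored representation grows only with the size of each factor's scope, which is the kind of structure a factored algorithm can exploit.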

research
02/12/2020

Regret Bounds for Discounted MDPs

Recently, it has been shown that carefully designed reinforcement learni...
research
09/13/2020

Oracle-Efficient Reinforcement Learning in Factored MDPs with Unknown Structure

We consider provably-efficient reinforcement learning (RL) in non-episod...
research
02/12/2020

A Tensor Network Approach to Finite Markov Decision Processes

Tensor network (TN) techniques - often used in the context of quantum ma...
research
02/14/2022

Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality

Deployment efficiency is an important criterion for many real-world appl...
research
07/19/2021

Provably Efficient Multi-Task Reinforcement Learning with Model Transfer

We study multi-task reinforcement learning (RL) in tabular episodic Mark...
research
08/29/2022

Categorical semantics of compositional reinforcement learning

Reinforcement learning (RL) often requires decomposing a problem into su...
research
05/23/2022

Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs

Recent studies have shown that episodic reinforcement learning (RL) is n...