Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes

06/29/2021
by Guillermo Infante, et al.

In this work we present a novel approach to hierarchical reinforcement learning for linearly-solvable Markov decision processes. Our approach assumes that the state space is partitioned, and that the subtasks consist of moving between partitions. We represent value functions at several levels of abstraction, and use the compositionality of subtasks to estimate the optimal values of the states in each partition. The policy is defined implicitly by these optimal value estimates, rather than being decomposed among the subtasks. As a consequence, our approach can learn the globally optimal policy, and does not suffer from the non-stationarity of high-level decisions. If several partitions have equivalent dynamics, the subtasks of those partitions can be shared. If the set of boundary states is smaller than the entire state space, our approach can have significantly lower sample complexity than a flat learner, and we validate this empirically in several experiments.
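The compositionality mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard first-exit formulation of a linearly-solvable MDP, in which the exponentiated value z = exp(v / lam) of the interior states satisfies a linear system, so the solution for any terminal values is a weighted sum of base-task solutions. The five-state chain, the rewards, the temperature `lam`, and the helper `solve_lmdp` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def solve_lmdp(P_II, P_IT, r_interior, z_terminal, lam=1.0):
    """Solve z_I = G (P_II z_I + P_IT z_T) for the interior states.

    P_II: passive transition probabilities interior -> interior
    P_IT: passive transition probabilities interior -> terminal
    r_interior: per-step reward of each interior state (assumed negative)
    z_terminal: exponentiated values exp(v / lam) of the terminal states
    """
    G = np.diag(np.exp(np.asarray(r_interior) / lam))
    A = np.eye(len(r_interior)) - G @ P_II
    b = G @ P_IT @ np.asarray(z_terminal)
    return np.linalg.solve(A, b)

# Hypothetical chain: terminal 0, interior 1..3, terminal 4,
# with an unbiased random walk as passive dynamics.
P_II = np.array([[0.0, 0.5, 0.0],
                 [0.5, 0.0, 0.5],
                 [0.0, 0.5, 0.0]])
P_IT = np.array([[0.5, 0.0],
                 [0.0, 0.0],
                 [0.0, 0.5]])
r = np.full(3, -1.0)

# Base subtasks: exactly one exit has z = 1, the other has z = 0.
z_left = solve_lmdp(P_II, P_IT, r, [1.0, 0.0])
z_right = solve_lmdp(P_II, P_IT, r, [0.0, 1.0])

# Arbitrary terminal values: solve directly, then compose from the base tasks.
z_T = np.array([np.exp(-2.0), np.exp(0.5)])
z_direct = solve_lmdp(P_II, P_IT, r, z_T)
z_composed = z_T[0] * z_left + z_T[1] * z_right

print(np.allclose(z_direct, z_composed))
```

In this toy setting the two base solutions play the role of subtasks for a single partition: once they are known, only the values of the exit (boundary) states are needed to recover the values of every interior state.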


research
06/30/2019

Detecting Spiky Corruption in Markov Decision Processes

Current reinforcement learning methods fail if the reward function is im...
research
06/03/2021

Hierarchical Representation Learning for Markov Decision Processes

In this paper we present a novel method for learning hierarchical repres...
research
03/10/2016

Hierarchical Linearly-Solvable Markov Decision Problems

We present a hierarchical reinforcement learning framework that formulat...
research
02/15/2023

Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes

We consider the optimal sample complexity theory of tabular reinforcemen...
research
10/15/2020

Optimal Dispatch in Emergency Service System via Reinforcement Learning

In the United States, medical responses by fire departments over the las...
research
09/22/2019

Faster saddle-point optimization for solving large-scale Markov decision processes

We consider the problem of computing optimal policies in average-reward ...
research
12/08/2016

Hierarchy through Composition with Linearly Solvable Markov Decision Processes

Hierarchical architectures are critical to the scalability of reinforcem...
