Computational Benefits of Intermediate Rewards for Hierarchical Planning

07/08/2021
by Yuexiang Zhai, et al.

Many hierarchical reinforcement learning (RL) applications have shown empirically that incorporating prior knowledge in reward design improves convergence speed and practical performance. We attempt to quantify the computational benefits of hierarchical RL from a planning perspective, under assumptions about the intermediate states and intermediate rewards that are frequently (but often implicitly) adopted in practice. Our approach reveals a trade-off between computational complexity and the pursuit of the shortest path in hierarchical planning: using intermediate rewards significantly reduces the computational complexity of finding a successful policy but does not guarantee finding the shortest path, whereas using sparse terminal rewards finds the shortest path at a significantly higher computational cost. We also corroborate our theoretical results with extensive experiments on the MiniGrid environments using Q-learning and other popular deep RL algorithms.
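To make one half of this trade-off concrete, here is a minimal sketch of how an intermediate reward can pull a planner off the shortest path. Everything in it is an assumption made for illustration and is not taken from the paper: the six-state graph, the state names (`S`, `W`, `G`, and so on), the reward values, the discount `GAMMA`, and the use of plain value iteration as the planner. It does not measure computational cost, so it says nothing about the complexity side of the result.

```python
# Toy deterministic MDP (hypothetical, not from the paper).
# Each state maps to the states reachable in one step; "G" is terminal.
EDGES = {
    "S": ["a", "W"],   # start: short branch via "a", long branch via waypoint "W"
    "a": ["G"],
    "W": ["b"],
    "b": ["c"],
    "c": ["G"],
    "G": [],
}
GAMMA = 0.9  # assumed discount factor


def plan(rewards, sweeps=10):
    """Run value iteration, then follow the greedy policy from "S".

    `rewards` maps a state to the reward collected on entering it.
    Returns the list of states visited by the greedy rollout.
    """
    values = {s: 0.0 for s in EDGES}
    for _ in range(sweeps):
        for s, succs in EDGES.items():
            if succs:
                values[s] = max(
                    rewards.get(n, 0.0) + GAMMA * values[n] for n in succs
                )
    path, s = ["S"], "S"
    while EDGES[s]:
        s = max(EDGES[s], key=lambda n: rewards.get(n, 0.0) + GAMMA * values[n])
        path.append(s)
    return path


# Sparse terminal reward only: the planner prefers the 2-step route.
sparse_path = plan({"G": 1.0})
# Extra intermediate reward at waypoint "W": the planner takes the 4-step route.
shaped_path = plan({"G": 1.0, "W": 1.0})

print(sparse_path)  # ['S', 'a', 'G']
print(shaped_path)  # ['S', 'W', 'b', 'c', 'G']
```

In this toy graph both policies reach the goal, but the one trained with the intermediate reward takes the longer route, because collecting the reward at `W` outweighs the discounting incurred by two extra steps.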

research
11/07/2019

A Hierarchical Optimizer for Recommendation System Based on Shortest Path Algorithm

Top-k Nearest Geosocial Keyword (T-kNGK) query on geosocial network is d...
research
03/13/2018

Lazy Receding Horizon A* for Efficient Path Planning in Graphs with Expensive-to-Evaluate Edges

Motion-planning problems, such as manipulation in cluttered environments...
research
07/13/2021

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on ...
research
11/03/2020

Deep Reinforcement Learning Based Dynamic Route Planning for Minimizing Travel Time

Route planning is important in transportation. Existing works focus on f...
research
04/14/2022

A* shortest string decoding for non-idempotent semirings

The single shortest path algorithm is undefined for weighted finite-stat...
research
03/15/2012

Variance-Based Rewards for Approximate Bayesian Reinforcement Learning

The explore-exploit dilemma is one of the central challenges in Reinforce...
research
06/28/2018

Hierarchical Reinforcement Learning with Abductive Planning

One of the key challenges in applying reinforcement learning to real-lif...
