A Constrained Randomized Shortest-Paths Framework for Optimal Exploration

07/12/2018
by Bertrand Lebichot, et al.

The present work extends the randomized shortest-paths (RSP) framework, which interpolates between shortest-path and random-walk routing in a network, in three directions. First, it shows how to handle equality constraints on a subset of transition probabilities and develops a generic algorithm for solving this constrained RSP problem using Lagrangian duality. Second, it derives a surprisingly simple iterative procedure for computing the optimal randomized routing policy, generalizing the previously developed "soft" Bellman-Ford algorithm. The resulting algorithm balances exploitation and exploration optimally by interpolating between purely random behavior and the deterministic optimal policy (least-cost paths) while satisfying the constraints. Finally, the two algorithms are applied to Markov decision problems by treating the process as a constrained RSP on a bipartite state-action graph. In this context, the derived "soft" value iteration algorithm appears closely related to dynamic policy programming as well as Kullback-Leibler and path integral control, and similar to a recently introduced reinforcement learning exploration strategy. This shows that the strategy is optimal in the RSP sense: it minimizes expected path cost subject to a relative entropy constraint. Simulation results on illustrative examples show that the model behaves as expected.
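To make the interpolation between random-walk and least-cost routing concrete, here is a minimal sketch of an unconstrained "soft" Bellman-Ford iteration. It is not the constrained algorithm of the paper. It assumes the log-sum-exp recurrence commonly used in earlier RSP work, phi_i = -(1/theta) log sum_j p_ref[i][j] exp(-theta (c_ij + phi_j)) with phi_target = 0. The function names, the dense-matrix representation, and the three-node example graph are all hypothetical choices made for illustration.

```python
import math

def soft_bellman_ford(cost, p_ref, target, theta, max_iter=1000, tol=1e-12):
    """Iterate the assumed soft recurrence until the values stop changing.

    cost[i][j]  : non-negative edge cost (ignored where p_ref[i][j] == 0)
    p_ref[i][j] : reference (random-walk) transition probabilities
    Assumes every non-target node has at least one successor.
    """
    n = len(cost)
    phi = [0.0] * n  # phi[target] stays 0 throughout
    for _ in range(max_iter):
        new_phi = [0.0] * n
        for i in range(n):
            if i == target:
                continue
            terms = [math.log(p_ref[i][j]) - theta * (cost[i][j] + phi[j])
                     for j in range(n) if p_ref[i][j] > 0.0]
            m = max(terms)  # log-sum-exp shift for numerical stability
            new_phi[i] = -(m + math.log(sum(math.exp(t - m) for t in terms))) / theta
        delta = max(abs(a - b) for a, b in zip(phi, new_phi))
        phi = new_phi
        if delta < tol:
            break
    return phi

def soft_policy(cost, p_ref, phi, target, theta):
    """Randomized policy: p_ij proportional to p_ref[i][j] * exp(-theta * (c_ij + phi_j))."""
    n = len(cost)
    policy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == target:
            continue
        logits = {j: math.log(p_ref[i][j]) - theta * (cost[i][j] + phi[j])
                  for j in range(n) if p_ref[i][j] > 0.0}
        m = max(logits.values())
        z = sum(math.exp(v - m) for v in logits.values())
        for j, v in logits.items():
            policy[i][j] = math.exp(v - m) / z
    return policy

# Hypothetical example: node 0 can go to 1 (cost 1) or straight to 2 (cost 5);
# node 1 goes to 2 (cost 1); node 2 is the target.
cost = [[0.0, 1.0, 5.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
p_ref = [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]

phi_greedy = soft_bellman_ford(cost, p_ref, target=2, theta=20.0)
phi_random = soft_bellman_ford(cost, p_ref, target=2, theta=0.001)
policy_greedy = soft_policy(cost, p_ref, phi_greedy, target=2, theta=20.0)
```

Under these assumptions, a large `theta` drives the value at node 0 toward the least-cost path length and concentrates the policy on the cheaper route, while a small `theta` keeps the policy close to the reference random walk and the value close to the expected cost under that walk.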

research
10/04/2019

Randomized Shortest Paths with Net Flows and Capacity Constraints

This work extends the randomized shortest paths model (RSP) by investiga...
research
07/01/2020

Sparse Randomized Shortest Paths Routing with Tsallis Divergence Regularization

This work elaborates on the important problem of (1) designing optimal r...
research
08/23/2021

Relative Entropy-Regularized Optimal Transport on a Graph: a new algorithm and an experimental comparison

Following [21, 23], the present work investigates a new relative entropy...
research
03/01/2019

Bounded Dijkstra (BD): Search Space Reduction for Expediting Shortest Path Subroutines

The shortest path (SP) and shortest paths tree (SPT) problems arise both...
research
07/13/2021

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on ...
research
06/07/2018

Randomized Optimal Transport on a Graph: Framework and New Distance Measures

The recently developed bag-of-paths framework consists in setting a Gibb...
research
09/19/2016

Incremental Sampling-based Motion Planners Using Policy Iteration Methods

Recent progress in randomized motion planners has led to the development...
