DiSProD: Differentiable Symbolic Propagation of Distributions for Planning

02/03/2023
by Palash Chatterjee, et al.

The paper introduces DiSProD, an online planner developed for environments with probabilistic transitions in continuous state and action spaces. DiSProD builds a symbolic graph that captures the distribution of future trajectories, conditioned on a given policy, using independence assumptions and approximate propagation of distributions. The symbolic graph provides a differentiable representation of the policy's value, enabling efficient gradient-based optimization for long-horizon search. The propagation of approximate distributions can be seen as an aggregation of many trajectories, making it well-suited for dealing with sparse rewards and stochastic environments. An extensive experimental evaluation compares DiSProD to state-of-the-art planners in discrete-time planning and real-time control of robotic systems. The proposed method improves over existing planners in handling stochastic environments, sensitivity to search depth, sparsity of rewards, and large action spaces. Additional real-world experiments demonstrate that DiSProD can control ground vehicles and surface vessels to successfully navigate around obstacles.
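The idea of propagating a distribution rather than sampling trajectories, and then following the gradient of the resulting value estimate, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical 1-D system with linear dynamics `s' = s + a + eps`, Gaussian noise that is independent of the state, and a quadratic reward, so that tracking only the mean and variance happens to be exact. All function names and parameters are invented for illustration, and the gradient is written out by hand where an automatic-differentiation framework would normally derive it from the computation graph.

```python
def propagate(mean, var, action, noise_var):
    """One step of s' = s + a + eps, eps ~ N(0, noise_var), on (mean, variance).

    Assumes the noise is independent of the current state, so variances add.
    """
    return mean + action, var + noise_var


def expected_value(actions, mean0, var0, goal, noise_var, action_cost):
    """Expected return of an open-loop action sequence under propagated moments."""
    mean, var, total = mean0, var0, 0.0
    for a in actions:
        mean, var = propagate(mean, var, a, noise_var)
        # For s with the given mean and variance:
        # E[-(s - goal)^2] = -(mean - goal)^2 - var
        total += -(mean - goal) ** 2 - var - action_cost * a ** 2
    return total


def value_gradient(actions, mean0, goal, action_cost):
    """Hand-derived gradient of expected_value with respect to each action.

    In this toy model the variance does not depend on the actions,
    so only the propagated means contribute to the gradient.
    """
    means, mean = [], mean0
    for a in actions:
        mean += a
        means.append(mean)
    grads = []
    for k, a in enumerate(actions):
        # Action k shifts the mean at step k and at every later step.
        downstream = sum(-2.0 * (m - goal) for m in means[k:])
        grads.append(downstream - 2.0 * action_cost * a)
    return grads


def plan(mean0, goal, horizon=5, action_cost=0.01, lr=0.02, steps=500):
    """Gradient ascent on the action sequence, starting from all zeros."""
    actions = [0.0] * horizon
    for _ in range(steps):
        grads = value_gradient(actions, mean0, goal, action_cost)
        actions = [a + lr * g for a, g in zip(actions, grads)]
    return actions


if __name__ == "__main__":
    planned = plan(mean0=0.0, goal=2.0)
    print("planned actions:", [round(a, 3) for a in planned])
    print("expected value:", expected_value(planned, 0.0, 0.0, 2.0, 0.1, 0.01))
```

In an online setting, a planner of this kind would typically execute only the first planned action, observe the new state, and re-plan. With nonlinear dynamics or rewards the moment updates would no longer be exact and would have to be approximated.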
