SACBP: Belief Space Planning for Continuous-Time Dynamical Systems via Stochastic Sequential Action Control
We propose a novel belief space planning technique for continuous dynamics by viewing the belief system as a hybrid dynamical system with time-driven switching. Our approach is based on the perturbation theory of differential equations and extends Sequential Action Control to stochastic belief dynamics. The resulting algorithm, which we name SACBP, does not require discretization of spaces or time and synthesizes control signals in near real-time. SACBP is an anytime algorithm that can handle general parametric Bayesian filters under certain assumptions. We demonstrate the effectiveness of our approach in an active sensing scenario and a model-based Bayesian reinforcement learning problem. In these challenging problems, we show that the algorithm significantly outperforms existing solution techniques, including approximate dynamic programming and local trajectory optimization.
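As background for the perturbation idea mentioned above, the following is a minimal sketch of the mode insertion gradient used in deterministic Sequential Action Control, not the SACBP algorithm itself. It assumes a hypothetical scalar system dx/dt = a*x + b*u with running cost 0.5*x^2 and a zero nominal control; the constants, function names, and Euler discretization are illustrative choices that do not come from the paper. The gradient rho(tau) * (f(x, u_new) - f(x, u_nominal)) estimates how the cost changes per unit duration when u_new is applied for a very short time at tau, and the sketch compares it against a finite-difference estimate.

```python
# Hypothetical scalar example (not from the paper):
# dynamics dx/dt = A*x + B*u, running cost 0.5*x^2, no terminal cost.
A, B = -1.0, 1.0
X0, T, DT = 1.0, 2.0, 1e-3
N = round(T / DT)


def simulate(u_of_step):
    """Forward-Euler rollout; returns the state trajectory and the integrated cost."""
    xs = [X0]
    cost = 0.0
    for k in range(N):
        x = xs[-1]
        cost += 0.5 * x * x * DT
        xs.append(x + DT * (A * x + B * u_of_step(k)))
    return xs, cost


def adjoint(xs):
    """Explicit Euler sweep, backward in time, of d(rho)/dt = -x - A*rho with rho(T) = 0."""
    rho = [0.0] * (N + 1)
    for k in range(N, 0, -1):
        rho[k - 1] = rho[k] + DT * (xs[k] + A * rho[k])
    return rho


def mode_insertion_gradient(tau, u_new, u_nominal=0.0):
    """First-order cost change per unit duration of applying u_new at time tau."""
    xs, _ = simulate(lambda k: u_nominal)
    rho = adjoint(xs)
    k = round(tau / DT)
    # f(x, u_new) - f(x, u_nominal) reduces to B * (u_new - u_nominal) here.
    return rho[k] * B * (u_new - u_nominal)


def finite_difference(tau, u_new, duration, u_nominal=0.0):
    """Apply u_new on [tau, tau + duration) and measure the cost change per unit duration."""
    k0, k1 = round(tau / DT), round((tau + duration) / DT)
    _, base = simulate(lambda k: u_nominal)
    _, perturbed = simulate(lambda k: u_new if k0 <= k < k1 else u_nominal)
    return (perturbed - base) / duration


if __name__ == "__main__":
    g = mode_insertion_gradient(tau=0.5, u_new=-1.0)
    fd = finite_difference(tau=0.5, u_new=-1.0, duration=0.01)
    print("mode insertion gradient:", g)
    print("finite-difference check:", fd)
```

A negative gradient indicates that briefly switching to u_new at tau lowers the cost to first order, which is the quantity a Sequential Action Control style method uses to choose the action and its application time. For a short duration the two printed values should agree closely. According to the abstract, SACBP extends this kind of reasoning to stochastic belief dynamics; that extension is not shown here.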