Adaptive Smoothing Path Integral Control

05/13/2020
by Dominik Thalmeier, et al.

In Path Integral control problems, a representation of the optimally controlled dynamical system can be formally computed and can serve as a guidepost for learning a parametrized policy. The Path Integral Cross-Entropy (PICE) method tries to exploit this but is hampered by poor sample efficiency. We propose a model-free algorithm called ASPIC (Adaptive Smoothing of Path Integral Control) that applies an inf-convolution to the cost function to speed up the convergence of policy optimization. We identify PICE as the infinite-smoothing limit of this technique and show that the sample efficiency problems from which PICE suffers disappear for finite levels of smoothing. For zero smoothing, the method becomes a greedy optimization of the cost, which is the standard approach in current reinforcement learning. We show analytically and empirically that intermediate levels of smoothing are optimal, which renders the new method superior to both PICE and direct cost optimization.
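To make the idea of smoothing a cost by inf-convolution concrete, here is a minimal one-dimensional sketch. It is not the ASPIC algorithm: the quadratic penalty, the grid search, and the names `inf_convolution` and `rugged_cost` are assumptions chosen for illustration, whereas the abstract's smoothing acts on the control cost of a policy. The sketch only shows the two limits qualitatively: a very stiff penalty leaves the cost unchanged (zero smoothing), and a vanishing penalty flattens it towards its global minimum (infinite smoothing).

```python
import numpy as np

def inf_convolution(cost, grid, penalty):
    """Grid approximation of min_y [cost(y) + 0.5 * penalty * (x - y)**2].

    Hypothetical helper: a large `penalty` means little smoothing,
    a small `penalty` means strong smoothing.
    """
    values = cost(grid)                      # cost at every candidate y
    diff = grid[:, None] - grid[None, :]     # diff[i, j] = x_i - y_j
    return np.min(values[None, :] + 0.5 * penalty * diff**2, axis=1)

def rugged_cost(x):
    # Illustrative cost with many local minima (an assumption, not from the paper).
    return 0.1 * x**2 + np.sin(5.0 * x)

grid = np.linspace(-3.0, 3.0, 601)
raw = rugged_cost(grid)
weak = inf_convolution(rugged_cost, grid, penalty=1e4)     # close to zero smoothing
medium = inf_convolution(rugged_cost, grid, penalty=5.0)   # intermediate smoothing
strong = inf_convolution(rugged_cost, grid, penalty=1e-6)  # close to infinite smoothing
```

In this toy setting the smoothed curve never exceeds the raw cost, because choosing `y = x` is always allowed in the minimization. Intermediate penalties fill in narrow local bumps while keeping the overall shape, which is the intuition behind preferring a finite level of smoothing.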

research
12/29/2017

Smoothed Dual Embedding Control

We revisit the Bellman optimality equation with Nesterov's smoothing tec...
research
12/18/2021

Derivative Action Control: Smooth Model Predictive Path Integral Control without Smoothing

Here, we present a new approach to generate smooth control sequences in ...
research
08/13/2012

Path Integral Control by Reproducing Kernel Hilbert Space Embedding

We present an embedding of stochastic optimal control problems, of the s...
research
06/29/2017

Path Integral Networks: End-to-End Differentiable Optimal Control

In this paper, we introduce Path Integral Networks (PI-Net), a recurrent...
research
06/04/2020

Model-Based Generalization Under Parameter Uncertainty Using Path Integral Control

This work addresses the problem of robot interaction in complex environm...
research
09/24/2021

Smoothing splines approximation using Hilbert curve basis selection

Smoothing splines have been used pervasively in nonparametric regression...
research
07/30/2015

Agglomerative clustering and collectiveness measure via exponent generating function

The key in agglomerative clustering is to define the affinity measure be...