Direct Policy Optimization using Deterministic Sampling and Collocation

10/16/2020
by Taylor A. Howell, et al.

We present an approach for approximately solving discrete-time stochastic optimal control problems by combining direct trajectory optimization, deterministic sampling, and policy optimization. Our feedback motion planning algorithm uses a quasi-Newton method to simultaneously optimize a nominal trajectory, a set of deterministically chosen sample trajectories, and a parameterized policy. We demonstrate that this approach exactly recovers LQR policies in the case of linear dynamics, quadratic cost, and Gaussian noise. We also demonstrate the algorithm on several nonlinear, underactuated robotic systems to highlight its performance and ability to handle control limits, safely avoid obstacles, and generate robust plans in the presence of unmodeled dynamics.
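To make the idea of deterministic sampling concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which sampling scheme is used, so the sketch assumes an unscented-transform-style set of 2n sigma points, and the names `sigma_points`, `sample_stats`, and `propagate_closed_loop`, as well as the matrices `A`, `B`, and the gain `K`, are hypothetical. It shows that for linear dynamics with a linear feedback policy, propagating the deterministic samples reproduces the closed-loop mean and covariance exactly, which is the setting in which the abstract reports recovering LQR policies.

```python
import numpy as np


def sigma_points(mean, cov):
    """Return 2n deterministic samples whose mean and covariance match (mean, cov).

    Assumed scheme: mean +/- sqrt(n) * (columns of the Cholesky factor of cov).
    """
    n = mean.shape[0]
    chol = np.linalg.cholesky(cov)      # cov = chol @ chol.T
    offsets = (np.sqrt(n) * chol).T     # each row is one scaled column of chol
    return np.concatenate([mean + offsets, mean - offsets])  # shape (2n, n)


def sample_stats(points):
    """Equal-weight sample mean and covariance of the rows of `points`."""
    mean = points.mean(axis=0)
    centered = points - mean
    cov = centered.T @ centered / points.shape[0]
    return mean, cov


def propagate_closed_loop(points, A, B, K):
    """Push each sample through x' = A x + B u with the linear policy u = -K x."""
    A_cl = A - B @ K
    return points @ A_cl.T


if __name__ == "__main__":
    # Hypothetical double-integrator-like system and feedback gain.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    K = np.array([[1.0, 1.5]])

    mean = np.array([1.0, -2.0])
    cov = np.array([[2.0, 0.3], [0.3, 0.5]])

    samples = sigma_points(mean, cov)
    next_mean, next_cov = sample_stats(propagate_closed_loop(samples, A, B, K))

    # For linear closed-loop dynamics these agree with the analytic propagation.
    A_cl = A - B @ K
    print(np.allclose(next_mean, A_cl @ mean))
    print(np.allclose(next_cov, A_cl @ cov @ A_cl.T))
```

In a nonlinear setting the same sample set would only approximate the propagated distribution, and additive process noise (omitted here) would have to be added to the covariance at each step.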

research
11/29/2018

Structure-preserving constrained optimal trajectory planning of a wheeled inverted pendulum

The Wheeled Inverted Pendulum (WIP) is an underactuated, nonholonomic me...
research
06/05/2021

Trajectory Optimization of Chance-Constrained Nonlinear Stochastic Systems for Motion Planning and Control

We present gPC-SCP: Generalized Polynomial Chaos-based Sequential Convex...
research
10/13/2021

Guided Policy Search using Sequential Convex Programming for Initialization of Trajectory Optimization Algorithms

Nonlinear trajectory optimization algorithms have been developed to hand...
research
03/29/2021

Distributionally Robust Trajectory Optimization Under Uncertain Dynamics via Relative-Entropy Trust Regions

Trajectory optimization and model predictive control are essential techn...
research
03/18/2022

Sampling Complexity of Path Integral Methods for Trajectory Optimization

The use of random sampling in decision-making and control has become pop...
research
10/06/2021

Entropy Regularised Deterministic Optimal Control: From Path Integral Solution to Sample-Based Trajectory Optimisation

Sample-based trajectory optimisers are a promising tool for the control ...
research
05/04/2021

Operator Splitting for Adaptive Radiation Therapy with Nonlinear Health Dynamics

We present an optimization-based approach to radiation treatment plannin...
