Optimizing Simulations with Noise-Tolerant Structured Exploration

05/20/2018
by Krzysztof Choromanski, et al.

We propose a simple, drop-in, noise-tolerant replacement for the standard finite difference procedure used ubiquitously in blackbox optimization. In our approach, parameter perturbation directions are defined by a family of structured orthogonal matrices. We show that, at the small cost of computing a Fast Walsh-Hadamard/Fourier Transform (FWHT/FFT), such structured finite differences consistently give higher-quality approximations of gradients and Jacobians than vanilla approaches that use coordinate directions or random Gaussian perturbations. We find that trajectory optimizers such as Iterative LQR and Differential Dynamic Programming require fewer iterations to solve several classic continuous control tasks when our methods, rather than standard finite differences, are used to linearize noisy blackbox dynamics. By embedding structured exploration in a quasi-Newton optimizer (LBFGS), we are able to learn agile walking and turning policies for quadruped locomotion that successfully transfer from simulation to actual hardware. We theoretically justify our methods via bounds on the quality of gradient reconstruction and provide a basis for applying them to nonsmooth problems as well.
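As a rough illustration of the idea in the abstract, the sketch below estimates a gradient by finite differences along the rows of a structured orthogonal matrix. It assumes one particular construction, M = HD/sqrt(n), with H a Sylvester-type Hadamard matrix and D a random diagonal of signs; the abstract does not specify the exact family the authors use, so this is only one plausible instance. The function names `fwht` and `structured_gradient`, the forward-difference form, and the default step size are all hypothetical choices made for this example.

```python
import numpy as np


def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    v = np.asarray(v, dtype=float).copy()
    n = v.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b          # butterfly: sum half
            v[i + h:i + 2 * h] = a - b  # butterfly: difference half
        h *= 2
    return v


def structured_gradient(f, x, step=1e-4, rng=None):
    """Forward-difference gradient estimate along the rows of M = H D / sqrt(n).

    Assumption (not from the source): H is a Sylvester Hadamard matrix and
    D is a random +/-1 diagonal, so the rows of M are orthonormal.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if n & (n - 1) != 0:
        raise ValueError("dimension must be a power of two for this sketch")
    signs = rng.choice([-1.0, 1.0], size=n)
    fx = f(x)
    diffs = np.empty(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        # Row i of M: H is symmetric, so fwht(e_i) gives row i of H.
        d = signs * fwht(e) / np.sqrt(n)
        diffs[i] = (f(x + step * d) - fx) / step
    # diffs approximates M @ grad; since M is orthogonal, grad = M.T @ diffs,
    # which is a single FWHT followed by the sign flips.
    return signs * fwht(diffs) / np.sqrt(n)


if __name__ == "__main__":
    a = np.arange(1.0, 9.0)
    estimate = structured_gradient(lambda z: float(a @ z), np.zeros(8))
    print(np.round(estimate, 6))
```

The directions are built one at a time here for readability; the final reconstruction step is where the transform replaces an explicit matrix product. Dimensions that are not a power of two would need padding or a different structured family, which this sketch does not attempt.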


research
03/06/2023

Improved Exploration for Safety-Embedded Differential Dynamic Programming Using Tolerant Barrier States

In this paper, we introduce Tolerant Discrete Barrier States (T-DBaS), a...
research
04/22/2021

Orthogonal iterations on Structured Pencils

We present a class of fast subspace tracking algorithms based on orthogo...
research
04/06/2018

Structured Evolution with Compact Architectures for Scalable Policy Optimization

We present a new method of blackbox optimization via gradient approximat...
research
10/15/2020

Dynamic Walking: Toward Agile and Efficient Bipedal Robots

Dynamic walking on bipedal robots has evolved from an idea in science fi...
research
05/18/2023

On iterative methods based on Sherman-Morrison-Woodbury regular splitting

We consider a regular splitting based on the Sherman-Morrison-Woodbury f...
research
03/06/2020

The linearization methods as a basis to derive the relaxation and the shooting methods

This chapter investigates numerical solution of nonlinear two-point boun...
