Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single

04/21/2023
by Paul Vicol, et al.

We propose ES-Single, an evolution strategies-based algorithm for estimating gradients in unrolled computation graphs. Like the recently proposed Persistent Evolution Strategies (PES), ES-Single is unbiased, and it overcomes the chaos that arises from recursive function applications by smoothing the meta-loss landscape. ES-Single samples a single perturbation per particle, which is kept fixed over the course of an inner problem (i.e., perturbations are not re-sampled for each partial unroll). Compared to PES, ES-Single is simpler to implement and has lower variance: its variance is constant with respect to the number of truncated unrolls, which removes a key barrier to applying ES to long inner problems with short truncations. We show that ES-Single is unbiased for quadratic inner problems, and demonstrate empirically that its variance can be substantially lower than that of PES. ES-Single consistently outperforms PES on a variety of tasks, including a synthetic benchmark task, hyperparameter optimization, training recurrent neural networks, and training learned optimizers.
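The idea of keeping one perturbation fixed across partial unrolls can be illustrated with a minimal NumPy sketch. It is based only on the description in the abstract and is not the authors' implementation: the toy inner problem (gradient descent on a scalar quadratic, with the learning rate as the meta-parameter `theta`), the antithetic pairing, and the names `unroll` and `es_single_gradients` are all assumptions made for illustration.

```python
import numpy as np


def unroll(state, theta, num_steps, curvature=1.0):
    """Toy inner problem (an assumption): gradient descent on
    0.5 * curvature * state**2 with learning rate theta.
    Returns the new state and the loss summed over the steps."""
    total_loss = 0.0
    for _ in range(num_steps):
        state = state - theta * curvature * state
        total_loss = total_loss + 0.5 * curvature * state ** 2
    return state, total_loss


def es_single_gradients(theta, sigma, total_steps, trunc_len,
                        num_particles, rng, init_state=1.0):
    """Return one gradient estimate per truncated unroll.

    Each particle draws its perturbation once, at the start of the
    inner problem, and reuses it for every partial unroll."""
    eps = rng.normal(0.0, sigma, size=num_particles)  # sampled once
    pos = np.full(num_particles, init_state)  # states under theta + eps
    neg = np.full(num_particles, init_state)  # states under theta - eps
    estimates = []
    for _ in range(total_steps // trunc_len):
        # Inner states persist across truncations; eps is NOT re-sampled.
        pos, loss_pos = unroll(pos, theta + eps, trunc_len)
        neg, loss_neg = unroll(neg, theta - eps, trunc_len)
        # Antithetic ES estimate for this truncation window.
        estimates.append(np.mean(eps * (loss_pos - loss_neg)) / (2 * sigma ** 2))
    return np.array(estimates)


rng = np.random.default_rng(0)
grads = es_single_gradients(theta=0.1, sigma=0.01, total_steps=20,
                            trunc_len=5, num_particles=64, rng=rng)
print(grads)        # one estimate per truncation window
print(grads.sum())  # estimate accumulated over the whole inner problem
```

In this sketch, because the perturbation is shared by all truncation windows, the per-window estimates add up to the standard antithetic ES estimate for the full unroll computed with the same perturbations. Whether and how the paper accumulates estimates across windows is not stated in the abstract.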


research
12/27/2021

Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies

Unrolled computation graphs arise in many scenarios, including training ...
research
09/22/2022

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

Meta-gradients provide a general approach for optimizing the meta-parame...
research
04/21/2023

Noise-Reuse in Online Evolution Strategies

Online evolution strategies have become an attractive alternative to aut...
research
02/06/2019

On the Variance of Unbiased Online Recurrent Optimization

The recently proposed Unbiased Online Recurrent Optimization algorithm (...
research
01/07/2019

Credit Assignment Techniques in Stochastic Computation Graphs

Stochastic computation graphs (SCGs) provide a formalism to represent st...
research
11/20/2017

Unbiased Simulation for Optimizing Stochastic Function Compositions

In this paper, we introduce an unbiased gradient simulation algorithms f...
research
03/12/2018

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

Stochastic neural net weights are used in a variety of contexts, includi...
