Black Box Lie Group Preconditioners for SGD

11/08/2022
by   Xilin Li, et al.

A matrix-free and a low-rank approximation preconditioner are proposed to accelerate the convergence of stochastic gradient descent (SGD) by exploiting curvature information sampled from Hessian-vector products or from finite differences of parameters and gradients, as in the BFGS algorithm. Both preconditioners are fitted in an online manner by minimizing a criterion that is free of line search and robust to stochastic gradient noise, and both are constrained to lie on certain connected Lie groups so as to preserve a corresponding symmetry or invariance, e.g., the orientation of coordinates is preserved by the connected general linear group with positive determinant. The Lie group's equivariance property facilitates preconditioner fitting, and its invariance property removes the need for damping, which is common in second-order optimizers but difficult to tune. The learning rate for parameter updating and the step size for preconditioner fitting are naturally normalized, and their default values work well in most situations.
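To make the idea concrete, here is a minimal sketch (not the paper's implementation) of the simplest instance: a diagonal preconditioner fitted online from finite differences of parameters and gradients, with multiplicative updates on the Lie group of positive diagonal matrices so that positivity is preserved without any damping. The fitting criterion, the toy quadratic problem, step sizes, and all function names are illustrative assumptions.

```python
import numpy as np

def fit_diag_preconditioner(p, dtheta, dgrad, step=0.1, eps=1e-12):
    """One online update of a diagonal preconditioner p (all entries > 0).

    Minimizes the criterion p*dgrad**2 + dtheta**2/p elementwise, whose
    minimizer is |dtheta|/|dgrad| (a secant, BFGS-like curvature fit).
    The multiplicative update p <- p * exp(-step * normalized gradient)
    stays on the Lie group of positive diagonal matrices, so positivity
    holds by construction and no damping or clipping is required.
    """
    c_grad = p * dgrad**2 - dtheta**2 / p             # d(criterion)/d(log p)
    c_grad = c_grad / (np.max(np.abs(c_grad)) + eps)  # normalized step size
    return p * np.exp(-step * c_grad)

# Toy ill-conditioned quadratic: loss = 0.5 * sum(h * theta**2)
rng = np.random.default_rng(0)
h = np.array([1.0, 100.0])
grad = lambda th: h * th

p = np.ones(2)                    # start from the identity preconditioner
theta = np.array([1.0, 1.0])

# Fit p from finite differences of gradients along random probe directions
for _ in range(100):
    dtheta = 1e-2 * rng.standard_normal(2)
    dgrad = grad(theta + dtheta) - grad(theta)   # equals h * dtheta here
    p = fit_diag_preconditioner(p, dtheta, dgrad)

# p is now close to 1/h, so a plain step size of 0.5 converges quickly
for _ in range(50):
    theta = theta - 0.5 * p * grad(theta)
```

For this quadratic the fitted p approaches 1/h, so the preconditioned step contracts every coordinate at roughly the same rate despite the condition number of 100. The paper's matrix-free and low-rank preconditioners generalize this diagonal case to richer Lie groups.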


Related research

- Towards Noise-adaptive, Problem-adaptive Stochastic Gradient Descent (10/21/2021)
- SGD momentum optimizer with step estimation by online parabola model (07/16/2019)
- Preconditioned Stochastic Gradient Descent (12/14/2015)
- Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks (03/26/2021)
- Random initialisations performing above chance and how to find them (09/15/2022)
- Time-Smoothed Gradients for Online Forecasting (05/21/2019)
- Learning a Lie Algebra from Unlabeled Data Pairs (09/19/2020)
