A Modular Approach to Block-diagonal Hessian Approximations for Second-order Optimization Methods

02/05/2019
by Felix Dangel, et al.

We propose a modular extension of the backpropagation algorithm for computing the block diagonal of the training objective's Hessian at various levels of refinement. The approach compartmentalizes the otherwise tedious construction of the Hessian into local modules. It is applicable to feedforward neural network architectures and can be integrated into existing machine learning libraries with relatively little overhead, facilitating the development of novel second-order optimization methods. Our formulation subsumes several recently proposed block-diagonal approximation schemes as special cases. Our PyTorch implementation is included with the paper.
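To give a rough feel for the idea, the sketch below backpropagates curvature through a tiny two-layer network in plain PyTorch, building a block-diagonal Hessian layer by layer. It is a minimal illustration assuming a single sample, a mean-squared-error loss, bias-free linear layers, and a sigmoid activation; it is not the authors' released implementation, and all names, shapes, and the specific network are assumptions made for this example.

```python
# Minimal sketch of layer-wise Hessian backpropagation (illustrative only,
# not the paper's released code). Single sample, so all Hessians are matrices.
import torch

torch.manual_seed(0)

# Tiny network: linear -> sigmoid -> linear, with 0.5 * ||z - y||^2 loss.
D_in, D_hidden, D_out = 3, 4, 2
W1 = torch.randn(D_hidden, D_in)
W2 = torch.randn(D_out, D_hidden)
x = torch.randn(D_in)
y = torch.randn(D_out)

# Forward pass, keeping the intermediates each module needs.
a = W1 @ x              # linear layer 1
h = torch.sigmoid(a)    # element-wise activation
z = W2 @ h              # linear layer 2
loss = 0.5 * ((z - y) ** 2).sum()

# Gradient and Hessian of the loss with respect to the network output.
dz = z - y                      # dE/dz
Hz = torch.eye(D_out)           # d^2E/dz^2 for 0.5 * ||z - y||^2

# --- Hessian backpropagation, module by module -------------------------
# Linear layer 2: z = W2 h. The Jacobian w.r.t. the input is W2 and the
# second derivatives vanish, so the input Hessian is W2^T Hz W2. The
# parameter block for W2 has Kronecker structure (h h^T) kron Hz.
Hh = W2.T @ Hz @ W2
HW2 = torch.kron(torch.outer(h, h), Hz)

# Sigmoid activation: h = sigma(a). The Jacobian is diag(sigma'(a)); the
# element-wise second derivative sigma''(a) couples to the gradient dE/dh.
dh = W2.T @ dz                          # dE/dh
s1 = h * (1 - h)                        # sigma'(a)
s2 = s1 * (1 - 2 * h)                   # sigma''(a)
Ha = torch.diag(s1) @ Hh @ torch.diag(s1) + torch.diag(s2 * dh)

# Linear layer 1: a = W1 x. Parameter block for W1, same structure as W2.
HW1 = torch.kron(torch.outer(x, x), Ha)

print(HW1.shape, HW2.shape)  # per-layer blocks of the block-diagonal Hessian
```

Each module only needs its local Jacobian and second derivative together with the Hessian handed back by the module above it, which is the sense in which the construction is compartmentalized into local modules.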
