HesScale: Scalable Computation of Hessian Diagonals

10/20/2022
by Mohamed Elsayed, et al.

Second-order optimization uses curvature information about the objective function, which can help in faster convergence. However, such methods typically require expensive computation of the Hessian matrix, preventing their usage in a scalable way. The absence of efficient ways of computation drove the most widely used methods to focus on first-order approximations that do not capture the curvature information. In this paper, we develop HesScale, a scalable approach to approximating the diagonal of the Hessian matrix, to incorporate second-order information in a computationally efficient manner. We show that HesScale has the same computational complexity as backpropagation. Our results on supervised classification show that HesScale achieves high approximation accuracy, allowing for scalable and efficient second-order optimization.
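To make the quantity in question concrete, the sketch below estimates the diagonal of the Hessian for a small toy objective. It is not HesScale: it uses central finite differences, a swapped-in technique that needs two extra function evaluations per parameter and so does not scale the way the abstract describes. The names `hessian_diagonal_fd` and `toy_loss` are hypothetical and chosen only for illustration.

```python
def hessian_diagonal_fd(f, w, h=1e-4):
    """Estimate the Hessian diagonal of f at w with central second differences.

    Illustration only: this costs 2 * len(w) + 1 evaluations of f, which is
    the kind of per-parameter cost a scalable method has to avoid.
    """
    f0 = f(w)
    diag = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += h
        w_minus[i] -= h
        # d^2 f / d w_i^2 ~= (f(w + h e_i) - 2 f(w) + f(w - h e_i)) / h^2
        diag.append((f(w_plus) - 2.0 * f0 + f(w_minus)) / (h * h))
    return diag


def toy_loss(w):
    # Hypothetical objective (an assumption, not from the paper):
    # 1.5 * w0^2 + 0.5 * w1^2 + w0 * w1, whose Hessian is [[3, 1], [1, 1]].
    return 1.5 * w[0] ** 2 + 0.5 * w[1] ** 2 + w[0] * w[1]


diag = hessian_diagonal_fd(toy_loss, [0.3, -0.7])
print(diag)
```

For this quadratic the two entries should come out close to 3 and 1, the diagonal of the true Hessian. The off-diagonal coupling between the two parameters is discarded, which is the trade-off any diagonal approximation accepts.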

research
02/05/2019

A Modular Approach to Block-diagonal Hessian Approximations for Second-order Optimization Methods

We propose a modular extension of the backpropagation algorithm for comp...
research
10/15/2019

Adjoint-based exact Hessian-vector multiplication using symplectic Runge–Kutta methods

We consider a function of the numerical solution of an initial value pro...
research
05/04/2022

Second-Order Sensitivity Analysis for Bilevel Optimization

In this work we derive a second-order approach to bilevel optimization, ...
research
06/20/2018

A Distributed Second-Order Algorithm You Can Trust

Due to the rapid growth of data and computational resources, distributed...
research
01/24/2019

Curvature-Exploiting Acceleration of Elastic Net Computations

This paper introduces an efficient second-order method for solving the e...
research
11/08/2022

The Hypervolume Indicator Hessian Matrix: Analytical Expression, Computational Time Complexity, and Sparsity

The problem of approximating the Pareto front of a multiobjective optimi...
research
05/23/2023

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

Given the massive cost of language model pre-training, a non-trivial imp...
