Fast Evaluation and Approximation of the Gauss-Newton Hessian Matrix for the Multilayer Perceptron

10/27/2019
by   Chao Chen, et al.

We introduce a fast algorithm for entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the multilayer perceptron. The algorithm has a precomputation step and a sampling step. While it generally requires O(Nn) work to compute an entry (and the entire column) in the GNH matrix for a neural network with N parameters and n data points, our fast sampling algorithm reduces the cost to O(n+d/ϵ^2) work, independent of N, where d is the output dimension of the network and ϵ is a prescribed accuracy. One application of our algorithm is constructing the hierarchical-matrix approximation of the GNH matrix for solving linear systems and eigenvalue problems. While it generally requires O(N^2) memory and O(N^3) work to store and factorize the GNH matrix, respectively, the hierarchical-matrix approximation requires only O(N r_o) memory and O(N r_o^2) work to be factorized, where r_o ≪ N is the maximum rank of off-diagonal blocks in the GNH matrix. We demonstrate the performance of our fast algorithm and the hierarchical-matrix approximation on classification and autoencoder neural networks.
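
To make "entry-wise evaluation" concrete, here is a minimal reference sketch of one GNH entry computed directly from the definition. It is not the fast algorithm from the abstract; it is the naive baseline whose cost grows with N. Everything in it is an illustrative assumption: a one-hidden-layer perceptron f(x) = W2 tanh(W1 x) without biases, a squared loss (so the GNH reduces to the average of J_i^T J_i over the data, where J_i is the d x N Jacobian of the output with respect to the parameters), and the hypothetical helper names `hidden`, `jacobian`, and `gnh_entry`.

```python
import math


def hidden(W1, x):
    """Hidden activations a = tanh(W1 x)."""
    return [math.tanh(sum(w * xk for w, xk in zip(row, x))) for row in W1]


def jacobian(W1, W2, x):
    """d x N Jacobian of f(x) = W2 tanh(W1 x) with respect to the parameters.

    Assumed parameter ordering: W1 in row-major order, then W2 in row-major order.
    """
    a = hidden(W1, x)
    d, h, p = len(W2), len(W1), len(x)
    J = []
    for c in range(d):
        # d f_c / d W1[j][k] = W2[c][j] * (1 - a_j^2) * x_k
        row = [W2[c][j] * (1.0 - a[j] ** 2) * x[k]
               for j in range(h) for k in range(p)]
        # d f_c / d W2[c2][j] = a_j if c2 == c, else 0
        row += [a[j] if c2 == c else 0.0
                for c2 in range(d) for j in range(h)]
        J.append(row)
    return J


def gnh_entry(W1, W2, X, s, t):
    """Entry (s, t) of the GNH under squared loss: (1/n) * sum_i (J_i^T J_i)[s][t]."""
    total = 0.0
    for x in X:
        J = jacobian(W1, W2, x)
        total += sum(Jc[s] * Jc[t] for Jc in J)
    return total / len(X)
```

This sketch builds the full Jacobian for every data point, so a single entry costs on the order of n d N operations. It is useful only as a correctness reference on small networks, for example to check a faster estimate of the same entry.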

research
12/20/2017

Block-diagonal Hessian-free Optimization for Training Neural Networks

Second-order methods for neural network optimization have several advant...
research
06/16/2020

Practical Quasi-Newton Methods for Training Deep Neural Networks

We consider the development of practical stochastic quasi-Newton, and in...
research
02/13/2019

Do Subsampled Newton Methods Work for High-Dimensional Data?

Subsampled Newton methods approximate Hessian matrices through subsampli...
research
09/03/2022

Quadratic Gradient: Uniting Gradient Algorithm and Newton Method as One

It might be inadequate for the line search technique for Newton's method...
research
10/25/2019

Convergence Analysis of the Randomized Newton Method with Determinantal Sampling

We analyze the convergence rate of the Randomized Newton Method (RNM) in...
research
11/21/2022

Efficient Second-Order Plane Adjustment

Planes are generally used in 3D reconstruction for depth sensors, such a...
research
04/06/2023

A matrix algebra approach to approximate Hessians

This work presents a novel matrix-based method for constructing an appro...
