Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks

10/08/2020
by Yikai Wu, et al.

The Hessian captures important properties of the loss landscape of deep neural networks. We observe that the eigenvectors and eigenspaces of the layer-wise Hessian of the neural network objective have several interesting structures: the top eigenspaces of different models overlap strongly, and the top eigenvectors form low-rank matrices when reshaped to the shape of the corresponding weight matrix. These structures, as well as the low-rank structure of the Hessian observed in previous studies, can be explained by approximating the Hessian with a Kronecker factorization. This understanding also explains why some of these structures become weaker when the network is trained with batch normalization. Finally, we show that the Kronecker factorization can be combined with PAC-Bayes techniques to obtain better explicit generalization bounds.
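
To illustrate why a Kronecker-factored Hessian would produce low-rank reshaped eigenvectors, here is a minimal NumPy sketch. It is not the authors' code: the factors `A` and `B` are random positive semidefinite stand-ins, and the sizes `n_in`, `n_out` and the row-major reshape convention are assumptions made only for this example. For an exactly Kronecker-structured matrix `H = kron(A, B)`, each eigenvector is a Kronecker product of an eigenvector of `A` and one of `B`, so reshaping it into a matrix gives a rank-1 outer product.

```python
import numpy as np


def random_psd(n, rng):
    """Return a random symmetric positive semidefinite n x n matrix."""
    m = rng.standard_normal((n, n))
    return m @ m.T


def top_eigvec(sym):
    """Return the eigenvector of a symmetric matrix with the largest eigenvalue."""
    _, vecs = np.linalg.eigh(sym)  # eigh returns eigenvalues in ascending order
    return vecs[:, -1]


rng = np.random.default_rng(0)
n_in, n_out = 4, 3  # hypothetical layer dimensions

# Hypothetical Kronecker factors (stand-ins, not taken from the paper).
A = random_psd(n_in, rng)
B = random_psd(n_out, rng)

# Exactly Kronecker-factored "Hessian" of size (n_in * n_out) x (n_in * n_out).
H = np.kron(A, B)

# Top eigenvector of H, reshaped to an n_in x n_out matrix (row-major order
# matches the index layout used by np.kron).
v = top_eigvec(H)
V = v.reshape(n_in, n_out)

# The singular values of V reveal its rank: only the first is non-negligible.
singular_values = np.linalg.svd(V, compute_uv=False)
print(singular_values)
```

In this idealized setting the second singular value is zero up to floating-point error. A real layer-wise Hessian is only approximately Kronecker-factored, so the reshaped eigenvectors would be expected to be approximately, not exactly, low rank.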

research
06/30/2021

Analytic Insights into Structure and Rank of Neural Network Hessian Maps

The Hessian of a neural network captures parameter interactions through ...
research
09/19/2018

Identifying Generalization Properties in Neural Networks

While it has not yet been proven, empirical evidence suggests that model...
research
04/08/2016

On the Hessian of Shape Matching Energy

In this technical report we derive the analytic form of the Hessian matr...
research
02/07/2020

Low Rank Saddle Free Newton: Algorithm and Analysis

Many tasks in engineering fields and machine learning involve minimizing...
research
05/16/2023

The Hessian perspective into the Nature of Convolutional Neural Networks

While Convolutional Neural Networks (CNNs) have long been investigated a...
research
10/20/2017

Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods

Our goal is to improve variance reducing stochastic methods through bett...
research
06/09/2023

Estimation of Ridge Using Nonlinear Transformation on Density Function

Ridges play a vital role in accurately approximating the underlying stru...
