The asymptotic spectrum of the Hessian of DNN throughout training

10/01/2019
by Arthur Jacot, et al.

The dynamics of DNNs during gradient descent are described by the so-called Neural Tangent Kernel (NTK). In this article, we show that the NTK allows one to gain precise insight into the Hessian of the cost of DNNs: we obtain a full characterization of the asymptotic spectrum of the Hessian, both at initialization and during training.
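To make the NTK/Hessian connection concrete: for a quadratic cost, the Hessian splits into a Gauss-Newton term, whose nonzero eigenvalues coincide (up to the 1/n factor from the mean) with those of the empirical NTK Gram matrix, plus a residual term weighted by the fitting errors. Below is a minimal JAX sketch of this relation at initialization; it is not code from the paper, and the toy two-layer network, dataset, and names (f, cost, ntk) are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): compare the spectrum of the
# empirical NTK Gram matrix with the spectrum of the full Hessian of a
# quadratic cost, for a toy two-layer network at initialization.
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
d_in, d_h, n = 3, 32, 10

# Toy dataset and a two-layer net in NTK parameterization (1/sqrt(fan-in) scaling).
X = jax.random.normal(k1, (n, d_in))
y = jnp.sin(X[:, 0])
params = {
    "W1": jax.random.normal(k2, (d_h, d_in)) / jnp.sqrt(d_in),
    "W2": jax.random.normal(k3, (1, d_h)) / jnp.sqrt(d_h),
}
flat, unravel = ravel_pytree(params)

def f(theta, x):
    # Network output for a single input x.
    p = unravel(theta)
    return (p["W2"] @ jnp.tanh(p["W1"] @ x))[0]

def cost(theta):
    # Quadratic cost over the whole dataset.
    preds = jax.vmap(lambda x: f(theta, x))(X)
    return 0.5 * jnp.mean((preds - y) ** 2)

# Empirical NTK Gram matrix: ntk[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>,
# scaled by 1/n to match the mean in the cost.
jac = jax.vmap(lambda x: jax.grad(f)(flat, x))(X)
ntk = jac @ jac.T / n

# Full Hessian of the cost with respect to all parameters, and its spectrum.
H = jax.hessian(cost)(flat)
print("top NTK eigenvalues:    ", jnp.linalg.eigvalsh(ntk)[-3:])
print("top Hessian eigenvalues:", jnp.linalg.eigvalsh(H)[-3:])
```

Heuristically, when the fitting errors are small or the network is very wide, the Gauss-Newton term dominates the residual term, so the top Hessian eigenvalues track the scaled NTK spectrum.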


Related research

10/01/2019 · How noise affects the Hessian spectrum in overparameterized neural networks
Stochastic gradient descent (SGD) forms the core optimization method for...

11/16/2018 · The Full Spectrum of Deep Net Hessians At Scale: Dynamics with Sample Size
Previous works observed the spectrum of the Hessian of the training loss...

01/24/2019 · Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
We consider deep classifying neural networks. We expose a structure in t...

12/16/2019 · PyHessian: Neural Networks Through the Lens of the Hessian
We present PyHessian, a new scalable framework that enables fast computa...

10/03/2018 · Combining Natural Gradient with Hessian Free Methods for Sequence Training
This paper presents a new optimisation approach to train Deep Neural Net...

01/29/2019 · An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
To understand the dynamics of optimization in deep neural networks, we d...

04/18/2023 · Hessian and increasing-Hessian orderings of multivariate skew-elliptical random vectors
In this work, we establish some stochastic comparison results for multiv...
