PyHessian: Neural Networks Through the Lens of the Hessian

12/16/2019 ∙ by Zhewei Yao, et al. ∙ 0

We present PyHessian, a new scalable framework that enables fast computation of Hessian (i.e., second-order derivative) information for deep neural networks. This framework is developed in Pytorch, and it enables distributed-memory execution on cloud or supercomputer systems. PyHessian enables fast computations of the top Hessian eigenvalue, the Hessian trace, and the full Hessian eigenvalue density. This general framework can be used to analyze neural network models, including the topology of the loss landscape (i.e., curvature information) to gain insight into the behavior of different models/optimizers. To illustrate this, we apply PyHessian to analyze the effect of residual connections and Batch Normalization layers on the smoothness of the loss landscape. One recent claim, based on simpler first-order analysis, is that residual connections and batch normalization make the loss landscape “smoother”, thus making it easier for Stochastic Gradient Descent to converge to a good solution. We perform an extensive analysis by measuring directly the Hessian spectrum using PyHessian. This analysis leads to finer-scale insight, demonstrating that while conventional wisdom is sometimes validated, in other cases it is simply incorrect. In particular, we find that batch normalization layers do not necessarily make the loss landscape smoother, especially for shallow networks. Instead, the claimed smoother loss landscape only becomes evident for deep neural networks. We perform extensive experiments on four residual networks (ResNet20/32/38/56) on Cifar-10/100 dataset. We have open-sourced our PyHessian framework for Hessian spectrum computation.



There are no comments yet.


page 2

page 26

page 27

page 28

page 29

page 30

page 31

page 32

Code Repositories


PyHessian is a Pytorch library for second-order based analysis and training of Neural Networks

view repo
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.