Hessian Eigenspectra of More Realistic Nonlinear Models

03/02/2021
by Zhenyu Liao, et al.

Given an optimization problem, the Hessian matrix and its eigenspectrum can be used in many ways, ranging from designing more efficient second-order algorithms to performing model analysis and regression diagnostics. When nonlinear models and non-convex problems are considered, strong simplifying assumptions are often made to make Hessian spectral analysis more tractable. This raises the question of how relevant the conclusions of such analyses are for more realistic nonlinear models. In this paper, we exploit deterministic equivalent techniques from random matrix theory to precisely characterize the Hessian eigenspectra for a broad family of nonlinear models, including models that extend the classical generalized linear models, without relying on the strong simplifying assumptions used previously. We show that, depending on the data properties, the nonlinear response model, and the loss function, the Hessian can have qualitatively different spectral behaviors: bounded or unbounded support, a single bulk or multiple bulks, and isolated eigenvalues on the left- or right-hand side of the bulk. By focusing on such a simple but nontrivial nonlinear model, our analysis takes a step toward unveiling the theoretical origin of many visually striking features observed in more complex machine learning models.
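
To make the object of study concrete, the following is a minimal numerical sketch of a Hessian eigenspectrum for one familiar generalized linear model, logistic regression on Gaussian data. It is not the paper's setup or method: the function name `logistic_hessian_eigs`, the dimensions `n` and `p`, the Gaussian data, and the random evaluation point `w` are all assumptions chosen for illustration. For the logistic loss the Hessian with respect to the weights is H = (1/n) X^T D X, where D is diagonal with entries s(1 - s) and s is the sigmoid of the linear response, so the labels do not enter.

```python
import numpy as np

def logistic_hessian_eigs(n=400, p=100, seed=0):
    """Eigenvalues of the logistic-loss Hessian at a random point (illustrative)."""
    rng = np.random.default_rng(seed)
    # Assumption: i.i.d. standard Gaussian features, n samples in dimension p.
    X = rng.standard_normal((n, p))
    # Assumption: evaluate the Hessian at a random weight vector of norm about 1.
    w = rng.standard_normal(p) / np.sqrt(p)
    z = X @ w                      # linear responses x_i^T w
    s = 1.0 / (1.0 + np.exp(-z))   # sigmoid of each response
    d = s * (1.0 - s)              # second derivative of the logistic loss
    H = (X.T * d) @ X / n          # H = (1/n) X^T diag(d) X
    # eigvalsh is for symmetric matrices and returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(H)

eigs = logistic_hessian_eigs()
print("smallest eigenvalue:", eigs[0])
print("largest eigenvalue:", eigs[-1])
```

A histogram of `eigs` gives the empirical spectrum for this toy case. Changing the data distribution, the response model, or the loss changes the weights `d`, which are the ingredients the abstract identifies as driving the shape of the spectrum.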

research
12/14/2020

A spectral characterization and an approximation scheme for the Hessian eigenvalue

We revisit the k-Hessian eigenvalue problem on a smooth, bounded, (k-1)-...
research
02/06/2019

Negative eigenvalues of the Hessian in deep neural networks

The loss function of deep networks is known to be non-convex but the pre...
research
03/30/2023

A Note On Nonlinear Regression Under L2 Loss

We investigate the nonlinear regression problem under L2 loss (square lo...
research
12/22/2019

Modeling Hessian-vector products in nonlinear optimization: New Hessian-free methods

In this paper, we suggest two ways of calculating interpolation models f...
research
07/28/2022

Adaptive Second Order Coresets for Data-efficient Machine Learning

Training machine learning models on massive datasets incurs substantial ...
research
11/22/2016

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

We look at the eigenvalues of the Hessian of a loss function before and ...
research
12/08/2021

Learning Linear Models Using Distributed Iterative Hessian Sketching

This work considers the problem of learning the Markov parameters of a l...
