A Hessian Based Complexity Measure for Deep Networks

05/28/2019
by Hamid Javadi et al.

Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the overparameterized regime, where the number of parameters exceeds the number of training data points. Consequently, understanding the generalization properties of these networks, and the role that (explicit or implicit) regularization plays in them, is of great importance. Inspired by the seminal work of Donoho and Grimes in manifold learning, we develop a new measure of the complexity of the function generated by a deep network, based on the integral of the norm of the tangent Hessian. This complexity measure can be used to quantify the irregularity of the function a deep network fits to training data, or as a regularization penalty for deep network learning. Indeed, we show that the oft-used heuristic of data augmentation imposes an implicit Hessian regularization during learning. We demonstrate the utility of our new complexity measure through a range of learning experiments.
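To make the abstract's key quantity concrete, the sketch below estimates a simplified variant of the measure in PyTorch: the average Frobenius norm of a network's input Hessian over a sample batch, a Monte Carlo stand-in for the integral described above. Note this is an illustration under assumptions, not the paper's method: the paper's measure projects the Hessian onto the tangent space of the data manifold, which is omitted here, and the toy network, the helper name hessian_penalty, and the random batch are all hypothetical.

import torch
import torch.nn as nn

# Hypothetical toy network standing in for "a deep network"; not from the paper.
net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))

def hessian_penalty(net, batch):
    # Average Frobenius norm of the full input Hessian over the batch --
    # a Monte Carlo stand-in for the integral of the Hessian norm, without
    # the tangent-space projection used in the paper.
    total = 0.0
    for x in batch:
        f = lambda z: net(z).squeeze()  # scalar-valued function for hessian()
        H = torch.autograd.functional.hessian(f, x)
        total = total + torch.linalg.matrix_norm(H)  # Frobenius norm by default
    return total / batch.shape[0]

batch = torch.randn(8, 2)        # stand-in data points
print(hessian_penalty(net, batch))

To use such a quantity as a training-time regularizer rather than a post-hoc complexity estimate, one would pass create_graph=True to torch.autograd.functional.hessian so that the penalty itself is differentiable with respect to the network weights.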

