The Equilibrium Hypothesis: Rethinking implicit regularization in Deep Neural Networks

10/22/2021
by   Yizhang Lou, et al.
Modern Deep Neural Networks (DNNs) exhibit impressive generalization on a variety of tasks without explicit regularization, suggesting the existence of hidden regularization effects. Recent work by Baratin et al. (2021) sheds light on an intriguing implicit regularization effect, showing that some layers are much more aligned with the data labels than others. This suggests that, as the network grows in depth and width, an implicit layer-selection phenomenon occurs during training. In this work, we provide the first explanation for this alignment hierarchy. We introduce and empirically validate the Equilibrium Hypothesis, which states that the layers achieving a balance between forward and backward information loss are those most aligned with the data labels. Our experiments demonstrate an excellent match with the theoretical predictions.
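To make the notion of "alignment with data labels" concrete, the sketch below measures a kernel-target alignment (in the spirit of centered kernel alignment) between each layer's features and the label Gram matrix yy^T. This is a minimal illustrative example on a random, untrained toy MLP, not the authors' experimental setup; the network, data, and layer sizes are all hypothetical.

```python
import numpy as np

def centered_gram(X):
    # Double-centered Gram matrix of the rows of X (as used in CKA).
    K = X @ X.T
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def label_alignment(X, y):
    # Cosine similarity (Frobenius inner product) between the centered
    # feature Gram matrix and the centered label Gram matrix y y^T.
    K = centered_gram(X)
    L = centered_gram(y.reshape(-1, 1))
    return float(np.sum(K * L) / (np.linalg.norm(K) * np.linalg.norm(L)))

rng = np.random.default_rng(0)
n, d = 64, 10
X0 = rng.normal(size=(n, d))        # toy inputs
y = np.sign(X0[:, 0])               # toy binary labels

# Hypothetical 3-layer random MLP: record alignment after each layer.
weights = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]
feats, scores = X0, []
for W in weights:
    feats = np.tanh(feats @ W)
    scores.append(label_alignment(feats, y))
print(scores)
```

In a trained network, plotting these per-layer scores is what reveals the alignment hierarchy the abstract refers to: some layers score markedly higher than others.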


Related research

03/11/2021
Intraclass clustering: an implicit learning ability that regularizes DNNs
Several works have shown that the regularization mechanisms underlying d...

12/07/2020
Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Renormalization Group
The success of deep learning in many real-world tasks has triggered an e...

08/03/2020
Implicit Regularization in Deep Learning: A View from Function Space
We approach the problem of implicit regularization in deep learning from...

06/11/2020
Deep Learning Requires Explicit Regularization for Reliable Predictive Probability
From the statistical learning perspective, complexity control via explic...

08/26/2019
A Probabilistic Representation of Deep Learning
In this work, we introduce a novel probabilistic representation of deep ...

10/20/2021
Convergence Analysis and Implicit Regularization of Feedback Alignment for Deep Linear Networks
We theoretically analyze the Feedback Alignment (FA) algorithm, an effic...

02/15/2021
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers
A deep equilibrium model uses implicit layers, which are implicitly defi...
