The Equilibrium Hypothesis: Rethinking implicit regularization in Deep Neural Networks

by Yizhang Lou, et al.

Modern Deep Neural Networks (DNNs) exhibit impressive generalization on a variety of tasks without explicit regularization, suggesting the existence of hidden regularization effects. Recent work by Baratin et al. (2021) sheds light on an intriguing implicit regularization effect, showing that some layers are much more aligned with data labels than others. This suggests that as the network grows in depth and width, an implicit layer-selection phenomenon occurs during training. In this work, we provide the first explanation for this alignment hierarchy. We introduce and empirically validate the Equilibrium Hypothesis, which states that the layers achieving a balance between forward and backward information loss are the ones most aligned with data labels. Our experiments demonstrate an excellent match with the theoretical predictions.
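The abstract does not specify the alignment metric, so as a rough illustration of what "alignment of a layer with data labels" can mean, here is a minimal sketch using linear centered kernel alignment (CKA) between a layer's activation matrix and one-hot labels. This is an assumption for illustration: the paper (following Baratin et al., 2021) may use a tangent-kernel alignment instead, and the `layer_activations` data here is random stand-in input, not a trained network.

```python
import numpy as np

def centered_kernel_alignment(X, Y):
    """Linear CKA between features X (n, d) and one-hot labels Y (n, c).

    Computes <Kc, Lc>_F / (||Kc||_F * ||Lc||_F) for centered Gram
    matrices Kc, Lc; the result lies in [0, 1], with 1 meaning the
    feature kernel is perfectly aligned with the label kernel.
    """
    K = X @ X.T                       # feature Gram matrix
    L = Y @ Y.T                       # label Gram matrix
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kc, Lc = H @ K @ H, H @ L @ H
    return np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc))

# Toy usage: score three hypothetical "layers" against the labels.
rng = np.random.default_rng(0)
n, c = 100, 3
labels = np.eye(c)[rng.integers(0, c, n)]        # one-hot labels
for depth in range(3):
    layer_activations = rng.standard_normal((n, 16))  # stand-in features
    a = centered_kernel_alignment(layer_activations, labels)
    print(f"layer {depth}: alignment = {a:.3f}")
```

Under the paper's hypothesis, a per-layer score like this would peak at the layers balancing forward and backward information loss, rather than increasing monotonically with depth.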






