The Equilibrium Hypothesis: Rethinking implicit regularization in Deep Neural Networks

10/22/2021
by Yizhang Lou, et al.

Modern Deep Neural Networks (DNNs) generalize well on a variety of tasks without explicit regularization, which suggests the existence of hidden regularization effects. Recent work by Baratin et al. (2021) sheds light on one such implicit effect, showing that some layers are much more aligned with data labels than others. This suggests that as the network grows in depth and width, an implicit layer selection phenomenon occurs during training. In this work, we provide the first explanation for this alignment hierarchy. We introduce and empirically validate the Equilibrium Hypothesis, which states that the layers that achieve some balance between forward and backward information loss are the ones with the highest alignment to data labels. Our experiments demonstrate an excellent match with the theoretical predictions.
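
The abstract does not spell out how the alignment between a layer and the data labels is measured. As a rough illustration only, the sketch below computes a centered kernel alignment score between a Gram matrix built from one layer's features and a Gram matrix built from one-hot labels. The function names, the use of plain features instead of a tangent kernel, and the toy data are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def centered_kernel_alignment(K, L):
    """Cosine similarity between two centered n x n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    Kc = H @ K @ H
    Lc = H @ L @ H
    # Frobenius inner product divided by the product of Frobenius norms.
    return float(np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc)))


def label_alignment(features, labels, num_classes):
    """Alignment of one layer's features (n x d) with integer class labels (n,).

    Hypothetical helper: uses a linear kernel on the features, which is a
    simplification chosen for this sketch.
    """
    K = features @ features.T          # feature Gram matrix
    Y = np.eye(num_classes)[labels]    # one-hot labels, n x num_classes
    L = Y @ Y.T                        # label Gram matrix
    return centered_kernel_alignment(K, L)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.array([0, 1, 0, 1, 2, 2])
    random_features = rng.normal(size=(6, 4))   # stand-in for a layer's activations
    perfect_features = np.eye(3)[labels]        # features that encode the labels exactly
    print("random layer: ", label_alignment(random_features, labels, 3))
    print("perfect layer:", label_alignment(perfect_features, labels, 3))
```

Computing such a score for each layer of a trained network would give a per-layer alignment profile of the kind the abstract refers to as an alignment hierarchy, under the assumptions stated above.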

03/11/2021

Intraclass clustering: an implicit learning ability that regularizes DNNs

Several works have shown that the regularization mechanisms underlying d...
08/03/2020

Implicit Regularization in Deep Learning: A View from Function Space

We approach the problem of implicit regularization in deep learning from...
12/07/2020

Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Renormalization Group

The success of deep learning in many real-world tasks has triggered an e...
06/11/2020

Deep Learning Requires Explicit Regularization for Reliable Predictive Probability

From the statistical learning perspective, complexity control via explic...
10/20/2021

Convergence Analysis and Implicit Regularization of Feedback Alignment for Deep Linear Networks

We theoretically analyze the Feedback Alignment (FA) algorithm, an effic...
02/15/2021

On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers

A deep equilibrium model uses implicit layers, which are implicitly defi...
08/26/2019

A Probabilistic Representation of Deep Learning

In this work, we introduce a novel probabilistic representation of deep ...