A Note on the Implicit Bias Towards Minimal Depth of Deep Neural Networks

02/18/2022
by Tomer Galanti, et al.

Deep learning systems have steadily advanced the state of the art across a wide variety of benchmarks, demonstrating impressive performance in tasks ranging from image classification <cit.> and language processing <cit.> to open-ended environments <cit.> and coding <cit.>. A central factor behind the success of these systems is the ability to train deep models rather than wide, shallow ones <cit.>. Intuitively, a deep neural network decomposes its computation into hierarchical representations, progressing from raw data to high-level, more abstract features. Yet, while training deep neural networks repeatedly yields superior performance over their shallow counterparts, an understanding of the role of depth in representation learning is still lacking. In this work, we suggest a new perspective on the role of depth in deep learning. We hypothesize that SGD training of overparameterized neural networks exhibits an implicit bias that favors solutions of minimal effective depth; namely, SGD trains neural networks in which the top several layers are redundant. To evaluate the redundancy of layers, we revisit the recently discovered phenomenon of neural collapse <cit.>.
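
To make the layer-redundancy diagnostic concrete, below is a minimal sketch (in PyTorch; an illustration, not the authors' code) of one standard neural-collapse measurement: the within-class to between-class variability ratio Tr(Σ_W) / Tr(Σ_B), computed layer by layer after SGD training. Under the minimal-effective-depth hypothesis, this ratio should already be near zero several layers below the output, suggesting the remaining layers are redundant. The network architecture, the synthetic Gaussian data, and all hyperparameters here are hypothetical choices made for the example.

```python
import torch
import torch.nn as nn

def collapse_ratio(features, labels):
    """Tr(Sigma_W) / Tr(Sigma_B) for one layer's features of shape (N, D)."""
    global_mean = features.mean(dim=0)
    sw, sb = 0.0, 0.0
    for c in labels.unique():
        fc = features[labels == c]
        mu_c = fc.mean(dim=0)
        # within-class variability around the class mean
        sw += ((fc - mu_c) ** 2).sum() / features.shape[0]
        # between-class variability of the class mean around the global mean
        sb += fc.shape[0] / features.shape[0] * ((mu_c - global_mean) ** 2).sum()
    return (sw / sb).item()

# Toy setup (hypothetical): Gaussian class clusters and a deep ReLU MLP.
torch.manual_seed(0)
num_classes, dim, n_per_class = 4, 20, 100
centers = 3.0 * torch.randn(num_classes, dim)
x = torch.cat([centers[c] + torch.randn(n_per_class, dim) for c in range(num_classes)])
y = torch.arange(num_classes).repeat_interleave(n_per_class)

layers = nn.ModuleList([nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(6)])
head = nn.Linear(dim, num_classes)
opt = torch.optim.SGD(list(layers.parameters()) + list(head.parameters()), lr=0.05)

# Plain full-batch SGD on the cross-entropy loss, for simplicity.
for step in range(2000):
    h = x
    for layer in layers:
        h = layer(h)
    loss = nn.functional.cross_entropy(head(h), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, report the collapse ratio layer by layer; near-zero values
# at intermediate layers indicate the layers above them add little.
with torch.no_grad():
    h = x
    for i, layer in enumerate(layers):
        h = layer(h)
        print(f"layer {i}: Tr(S_W)/Tr(S_B) = {collapse_ratio(h, y):.4f}")
```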
