1 Introduction
Deep neural networks have been remarkably successful in many real-world machine learning applications. When applying them in critical domains, a distilled understanding of these systems is at least as important as their state-of-the-art performance. Recent work on understanding why deep networks perform so well in practice has focused on questions such as the networks' behavior under drifting or even adversarially perturbed data distributions. Another line of research relevant to this work studies how to interpret or explain the decision function of trained networks. While related, this work takes a different angle: we focus on the role of individual layers in trained networks and then relate the empirical results to generalization and robustness properties.
Theoretical research on the representation power of neural networks is well established. It is known that a neural network with a single, sufficiently wide hidden layer is a universal approximator for continuous functions over a compact domain (Gybenko, 1989; Hornik, 1991; Anthony and Bartlett, 2009). More recent research examines whether deep networks can have greater representation power than shallow ones with the same number of units or edges (Pinkus, 1999; Delalleau and Bengio, 2011; Montufar et al., 2014; Telgarsky, 2016; Shaham et al., 2015; Eldan and Shamir, 2015; Mhaskar and Poggio, 2016; Rolnick and Tegmark, 2017). The capacity to represent arbitrary functions with finite samples is also extensively discussed in recent work (Hardt and Ma, 2017; Zhang et al., 2017; Nguyen and Hein, 2018; Yun et al., 2018). However, the constructions used in the aforementioned work for building networks that approximate particular functions are typically "artificial" and unlikely to be obtained by gradient-based learning algorithms. We focus instead on empirically studying the roles that different layers in a deep architecture take on after gradient-based training.
Research on the generalization of deep neural networks has attracted a lot of interest. The observation that large neural networks can fit random labels on the training set (Zhang et al., 2017) makes it difficult to apply classical learning-theoretic results based on uniform convergence over the hypothesis space. One approach to circumvent this issue is to show that, while the space of neural networks of a given architecture is very large, gradient-based learning on "well-behaved" tasks leads to relatively "simple" models. More recent research focuses on the analysis of post-training complexity metrics such as norm, margin, robustness, flatness, or compressibility of the learned model, in contrast to the pre-training capacity of the entire hypothesis space. This line of work has resulted in improved generalization bounds for deep neural networks; see for instance Dziugaite and Roy (2016); Kawaguchi et al. (2017); Bartlett et al. (2017); Neyshabur et al. (2018, 2017); Liang et al. (2017); Arora et al. (2018); Zhou et al. (2019) and the references therein. This paper provides further empirical evidence and alludes to a potentially more fine-grained analysis.
In particular, we show empirically that the layers of a deep network are not homogeneous in the role they play in representing the prediction function. Some layers are critical to forming good predictions, while others are fairly robust to the assignment of their parameters during training. Moreover, depending on the capacity of the network and the complexity of the target function, networks trained with gradient-based methods conserve complexity by not using excess capacity. The exact definitions and the implications for generalization are discussed in the body of the paper.
Before proceeding to the body of the paper, we would like to point out a few related papers. Modern neural networks are typically overparameterized and thus have plenty of redundancy in their representation capabilities. Previous work exploited overparameterization to compress (Han et al., 2015) or distill (Hinton et al., 2015) a trained network. Rosenfeld and Tsotsos (2018) found that comparable performance can be achieved by training only a small fraction of the network parameters, such as a subset of the channels in each convolutional layer. Towards interpreting residual networks as ensembles of shallow networks, Veit et al. (2016) found that residual blocks in a trained network can, to some extent, be deleted or permuted without hurting the performance too much. In another line of research, it has been shown that under extreme overparameterization, such as when the network width is polynomial in the training set size and input dimension (Allen-Zhu et al., 2018; Du et al., 2018a, b; Zou et al., 2018), or even in the asymptotic regime of infinite width (Lee et al., 2019), the network weights move slowly during training. The observations in this paper show that in the more practical regime, different layers can behave very differently.
The rest of the paper is organized as follows: the experimental framework and our main notions of layerwise robustness are introduced in Section 2. Section 3 presents the results and analysis of layerwise robustness for a wide range of neural network models. Section 4 discusses the theoretical implications for generalization. Studies on joint robustness and connections to other notions of robustness are presented in Section 5 and Section 6, respectively. Finally, the paper ends with a conclusion that summarizes our main contributions.
2 Setting
Feed-forward networks naturally consist of multiple layers, where each unit in a layer takes inputs from units in the previous layer. It is common to form a layer using a linear transform (e.g. convolution), followed by some kind of normalization (e.g. batch normalization), and then a unit-wise nonlinear activation function (e.g. rectification, ReLU). We use the term layer in a more general sense to denote any layer-like computation block. In particular, a residual block in a residual network (He et al., 2016a) can also be treated as a layer.

Let $\mathcal{F}$ be the space of functions realizable by a particular neural network architecture with $D$ (parametric) layers, with parameters $\theta = (\theta_1, \dots, \theta_D)$. We use the term capacity to refer to properties of the entire space $\mathcal{F}$ before training takes place. Capacity is usually measured by notions such as the Rademacher complexity, the VC dimension, and various types of covering numbers. The term complexity is used in reference to properties of a single neural network $f \in \mathcal{F}$, often employing a norm of the parameters, possibly normalized by empirical quantities such as the margin.
We are interested in analyzing the post-training behavior of the layers of popular deep networks. Such networks are typically trained using stochastic gradient methods, which start by sampling random values $\theta^0 = (\theta_1^0, \dots, \theta_D^0)$ for the parameters of the $D$ layers from a predefined distribution $\mathcal{P}^0$. The choice of $\mathcal{P}^0_d$ typically depends on the type, fan-in, and fan-out of each layer. During training, the parameters are iteratively updated via
$$\theta^{\tau+1} = \theta^{\tau} - \eta^{\tau} g^{\tau}, \qquad \tau = 0, 1, 2, \dots \tag{1}$$

where $g^{\tau}$ designates a stochastic estimate of the gradient of the loss on a sample of examples with respect to $\theta^{\tau}$, and $\eta^{\tau}$ is the learning rate. Variants that use momentum and preconditioners, as well as adaptive learning-rate schedules for $\eta^{\tau}$, are commonly used in practice but are not employed in this paper. After training for $T$ iterations, the parameters $\theta^T$ are used as the final trained model.

A deep network builds up the representation of its inputs by incrementally applying the nonlinear transformations defined by each layer. As a result, the representation at a particular layer recursively depends on all the layers beneath it. This complex dependency makes it challenging to isolate and inspect each layer independently in theoretical studies. In this paper, we introduce and use the following two empirical probes to inspect the individual layers of a trained neural network.
Reinitialization
After training, for a given layer $d$, we can reinitialize its parameters through the assignment $\theta_d^T \leftarrow \theta_d^0$, while keeping the parameters of the other layers unchanged. The model with the mixed parameters is then evaluated. Unless noted otherwise, we use the term performance to designate classification error on test data. The performance of a network in which layer $d$ was reinitialized is referred to as the reinitialization robustness of layer $d$. Note that $\theta_d^0$ denotes the random values realized at the beginning of training. More generally, for a checkpoint saved at time step $\tau$, we can reinitialize the $d$-th layer by setting $\theta_d^T \leftarrow \theta_d^{\tau}$, and obtain the reinitialization robustness of layer $d$ after $\tau$ updates.
Rerandomization
To go one step further, we also examine rerandomization of a layer: resampling fresh random values $\tilde{\theta}_d \sim \mathcal{P}^0_d$ and evaluating the model's performance with $\theta_d^T \leftarrow \tilde{\theta}_d$. Analogously, we refer to the resulting performance as the rerandomization robustness of layer $d$.
Note that there is no retraining or fine-tuning after reinitialization or rerandomization; the network is evaluated directly with the mixed weights. When a network exhibits no, or only a negligible, decrease in performance after reinitializing or rerandomizing a layer, we say that the layer is robust; otherwise, the layer is called critical.
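The two probes can be sketched concretely on a toy model. The dictionary-of-layers representation, the layer names, and the initializer below are illustrative stand-ins, not the paper's implementation; the point is only that the probed model mixes trained and checkpoint (or freshly sampled) weights with no retraining.

```python
import copy
import random

def reinitialize(trained, checkpoints, layer, t=0):
    """Reinitialization probe: reset one layer to its checkpoint-t values
    (checkpoint 0 is the random initialization), keep the rest trained."""
    probed = copy.deepcopy(trained)
    probed[layer] = copy.deepcopy(checkpoints[t][layer])
    return probed

def rerandomize(trained, layer, init_fn, rng):
    """Rerandomization probe: resample one layer's weights afresh from
    the initialization distribution."""
    probed = copy.deepcopy(trained)
    probed[layer] = [init_fn(rng) for _ in probed[layer]]
    return probed

init    = {"layer1": [0.1, 0.2], "layer2": [0.3, 0.4]}   # checkpoint 0
trained = {"layer1": [1.1, 1.2], "layer2": [1.3, 1.4]}   # after training
probe = reinitialize(trained, {0: init}, "layer2")
# Evaluating `probe` directly (no retraining) gives the reinitialization
# robustness of layer2; layer1 keeps its trained values.
```

In practice the same pattern applies to framework checkpoints: load the trained weights, overwrite one layer's tensors from an earlier checkpoint or a fresh sample, and run evaluation.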
3 Robustness of individual layers
In this section, we study the layer robustness of commonly used neural networks on standard image classification benchmarks, in particular MNIST, CIFAR10, and ImageNet. All the networks are trained using SGD with Nesterov momentum and a piecewise-constant learning rate schedule. Please refer to Appendix A for further details.

3.1 Fully connected networks
Figure 1: (a) Test error rate: each row corresponds to one layer in the network. The last row shows the full model performance at the corresponding epoch (i.e. all the model parameters are loaded from that checkpoint) as a reference. The first column designates the robustness of each layer w.r.t. rerandomization, and the remaining columns designate reinitialization robustness at different checkpoints. The last column shows the final performance (at the last checkpoint during training) as a reference. (b-c) Weight distances: each cell in the heatmaps depicts the normalized $\ell_2$-norm (b) or $\ell_\infty$-norm (c) distance of the trained parameters to their initial weights.

We start by examining the robustness of fully connected networks (FCNs). An FCN consists of several fully connected layers, each with the same output dimension and a ReLU activation function. The extra final layer is a linear multiclass predictor with one output per class.
As a starting point, we trained an FCN on the MNIST digit classification task and applied the reinitialization and rerandomization analyses to the trained model. The results are shown in Figure 1(a). As expected, due to the intricate dependency of the classification function on each of the layers, rerandomizing any layer completely disintegrates the representations, and the classification accuracy drops to the level of random guessing. For reinitialization, however, we find that while the first layer is very sensitive, the remaining layers are robust to being reinitialized to their pre-training random weights.
A plausible explanation for this phenomenon is that gradient norms grow during backpropagation, to the point that the bottom layers are updated more aggressively than the top ones. Alas, if this were the case, we would expect a smooth transition rather than a sharp contrast at the first layer. We therefore measured how far the weights of each layer move from their initialization, "checkpoint 0", using the normalized $\ell_2$ norm and the $\ell_\infty$ norm. The results are shown in Figure 1(b) and (c), respectively. As we can see, robustness to reinitialization does not obviously correlate with either of the distances. Figure 2 shows the results on a deeper FCN, which demonstrates the same phenomenon. The figure also shows that the cross-entropy loss on the test set behaves similarly to the classification error. This suggests there may be something more intricate going on than a simple exploding-gradient issue. We loosely summarize the observations as follows:
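The two distance measures can be computed per layer as sketched below. Normalizing the $\ell_2$ distance by the norm of the initial weights is our assumption for illustration, since the exact normalization is not restated in this excerpt.

```python
import math

def l2(v):
    return math.sqrt(sum(x * x for x in v))

def normalized_l2_dist(w_final, w_init):
    """Normalized l2 distance of trained weights to initialization
    (normalizer chosen here as ||w_init||, an illustrative assumption)."""
    return l2([a - b for a, b in zip(w_final, w_init)]) / l2(w_init)

def linf_dist(w_final, w_init):
    """l-infinity distance: largest single-coordinate change."""
    return max(abs(a - b) for a, b in zip(w_final, w_init))

w0 = [3.0, 4.0]   # "checkpoint 0" weights of one layer, ||w0||_2 = 5
wT = [3.0, 1.0]   # weights of the same layer after training
d2   = normalized_l2_dist(wT, w0)   # ||(0, -3)||_2 / 5 = 0.6
dinf = linf_dist(wT, w0)            # 3.0
```

Computing these per layer across checkpoints reproduces the kind of heatmaps shown in Figure 1(b) and (c).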
Over-capacitated deep networks trained with stochastic gradient methods have low complexity due to self-restricting the number of critical layers.
Intuitively, if a subset of the parameters can be reinitialized to their random values at checkpoint 0 (which are independent of the training data), then the effective number of parameters, and as a result the complexity of the model, can be reduced. We defer a more detailed discussion of the theoretical implications to Section 4.
3.2 Adaptive complexity adjustment
We next applied the same analysis procedure to a large number of different configurations in order to assess the effects of network capacity and task complexity on layer robustness.
As the results in the previous section show, the first layer is rather sensitive to reinitialization while the remaining layers are quite robust. In Figure 3(a), we compare the average reinitialization robustness of all layers but the first for FCNs of varying hidden dimensions on MNIST. It is clear that the upper layers become more robust as the hidden dimension increases. We believe this reflects the fact that wider FCNs have higher model capacity. When the capacity is small, all layers are vigilant participants in representing the prediction function. As capacity increases, it suffices to use the bottom layer, while the rest act as random projections with nonlinearities.
Similarly, Figure 3(b) shows experiments on CIFAR10, which has the same number of classes and a comparable number of training examples as MNIST, but is more difficult to classify. While it is hard to directly compare the robustness of the same model across the two tasks, we still observe similar trends as the hidden dimension increases, though not as pronounced. Informally put, the difficulty of the learning task seems to necessitate more diligence in forming accurate predictions.
Figure 3: Each bar designates the difference in classification error between a fully trained model and a model with one layer reinitialized. The error bars designate one standard deviation obtained by running five experiments with different random initializations.
In summary, the empirical results presented in this section provide evidence that deep networks automatically adjust their de-facto complexity: when a big network is trained on an easy task, only a few layers seem to play critical roles.
3.3 Large convolutional networks
On typical computer vision tasks beyond MNIST, densely connected FCNs are significantly outperformed by convolutional neural networks. VGGs and ResNets are among the most widely used convolutional network architectures. Figure 4 and Figure 5 show the robustness analysis on the two types of networks, respectively. Since these networks are much deeper than the FCNs, we transpose the heatmaps to show the layers as columns. For VGGs, a large number of layers are sensitive to reinitialization, but the patterns are similar to the observations on the simple FCNs on MNIST: the bottom layers are critical while the upper layers are robust to reinitialization.
The results for ResNets in Figure 5 are to be considered together with the results on ImageNet in Figure 6. We found the robustness patterns of ResNets more interesting, mainly for two reasons:
ResNets redistribute sensitive layers.
Unlike the FCNs and VGGs, which put the sensitive layers at the bottom of the network, ResNets distribute them across the network. To better understand the patterns, let us briefly recap the ResNet architectures. In theoretical analysis, it is common to broadly define ResNets as any neural network architecture with residual blocks. In practice, a few "standard" architectures (and variants) that divide the network into a few "stages" are commonly used. At the bottom, there is a preprocessing stage (stage0) with vanilla convolutional layers. It is followed by a few (typically 4) residual stages (stage1 to stage4) consisting of residual blocks, and then global average pooling and the densely connected linear classifier (final_linear). The image size shrinks and the number of convolutional feature channels doubles from each residual stage to the next.^1 As a result, while most of the residual blocks have real identity skip connections, the first block of each stage (stage*.resblk1), which connects to the previous stage, has a non-identity skip connection due to the different input/output shapes. Figure 7 illustrates the two types of residual blocks.

^1 There are more subtle details, especially at stage1, depending on factors like the input image size, whether the residual blocks contain a bottleneck, and the version of the ResNet, etc.
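The two residual block types described above can be sketched as follows; the 1-D "feature vectors" and the transform/projection callables are toy stand-ins for the convolutional branches.

```python
def identity_block(x, residual_fn):
    """Residual block with a real identity skip: output = x + F(x)."""
    return [xi + ri for xi, ri in zip(x, residual_fn(x))]

def projection_block(x, residual_fn, project_fn):
    """First block of a stage: the skip path must change shape (e.g. the
    channel count doubles), so the skip is a non-identity projection of
    the input rather than the input itself."""
    return [ri + si for ri, si in zip(residual_fn(x), project_fn(x))]

x = [1.0, 2.0]
same_shape = identity_block(x, lambda v: [0.1 * vi for vi in v])
# toy "projection" doubling the channel count from 2 to 4
wider = projection_block(
    x,
    lambda v: [0.0, 0.0, 0.0, 0.0],        # residual branch, new shape
    lambda v: [v[0], v[0], v[1], v[1]],    # skip: project to new shape
)
```

The distinction matters for the robustness results: only blocks whose skip is a true identity can "pass through" their input unchanged when the residual branch is rerandomized.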
With this big picture of the ResNet architectures in mind, we can see that each stage in a ResNet acts as a subnetwork, and the layerwise robustness pattern within each stage resembles that of the VGGs and FCNs.
Residual blocks can be robust to rerandomization.
Among the layers that are robust to reinitialization, those that are residual blocks are also robust to rerandomization: e.g. compare the final_linear layer with any of the robust residual blocks. A possible reason is that the identity skip connection dominates the residual branch in those blocks. It is known from previous lesion studies (Veit et al., 2016) that residual blocks in a ResNet can be removed without seriously hurting performance. Our experiments, however, put this in context with other architectures and study the adaptive robustness with respect to the interplay between model capacity and task difficulty. In particular, comparing the results on CIFAR10 and ImageNet, we see that, especially on ResNet18 in Figure 6(a), many residual blocks with real identity skip connections also become sensitive compared to bigger models, due to the smaller capacity.
4 Theoretical Implications on Generalization
As mentioned earlier, if some parameters can be reassigned their randomly initialized values without affecting the model performance, then the effective number of parameters is reduced, since the random initialization is independent of the training data. The benefit for generalization is most easily demonstrated with a naive parameter-counting generalization bound. Suppose we have a generalization bound of the form

$$L(f) \le \hat{L}(f) + \mathcal{B}(C, n),$$

where $f$ is a model with $C$ parameters trained on $n$ i.i.d. samples, $\hat{L}$ and $L$ denote the empirical and expected risk, $C$ is a complexity measure based on counting the number of parameters, and $\mathcal{B}$ is the corresponding generalization bound. For example, Anthony and Bartlett (2009) provide various bounds on the VC dimension based on the number of weights in a neural network, which can then be plugged into standard VC-dimension-based generalization bounds for classification (Vapnik, 1998). Now, if we know that a fraction $\rho$ of the neural network weights will be robust to reinitialization after training, at a loss in the (empirical) risk of at most $\varepsilon$, then we get

$$L(\tilde{f}) \le \hat{L}(f) + \varepsilon + \mathcal{B}\big((1-\rho)\,C, n\big),$$

where $\tilde{f}$ is the model obtained by reinitializing the robust $\rho$-fraction of the parameters of the trained model $f$. Note that generalization bounds based on parameter counting generally do not work well for deep learning: because of the heavy overparameterization, the resulting bounds are usually trivial. However, as noted in Arora et al. (2018), most of the alternative generalization bounds recently proposed for deep neural network models are actually worse than naive parameter counting. Moreover, by tweaking existing analyses with an additional layerwise robustness condition, some PAC-Bayes based bounds could potentially be improved as well (Wang et al., 2018; Arora et al., 2018; Zhou et al., 2019).

Note that, like the results in Arora et al. (2018); Zhou et al. (2019), the bounds provided by reinitialization robustness are for a different model (in our case, the reinitialized one). Alternative approaches in the literature involve modifying the training algorithm to explicitly optimize the robustness or some derived generalization bound (Neyshabur et al., 2015; Dziugaite and Roy, 2016). However, neither type of argument provides guarantees for the model directly trained by SGD.
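Plugging toy numbers into the parameter-counting argument illustrates the effect. The $\sqrt{C/n}$ bound shape below is schematic for illustration only, not a bound from the paper.

```python
import math

def naive_bound(num_params, n):
    """Schematic parameter-counting bound ~ sqrt(C / n)."""
    return math.sqrt(num_params / n)

def reinit_bound(num_params, n, rho, eps):
    """Same schematic bound after a rho-fraction of the parameters is
    shown robust to reinitialization at an empirical-risk cost <= eps:
    only (1 - rho) * C parameters remain data-dependent."""
    return math.sqrt((1 - rho) * num_params / n) + eps

C, n = 1_000_000, 50_000
full    = naive_bound(C, n)                       # vacuous when C >> n
reduced = reinit_bound(C, n, rho=0.75, eps=0.01)  # strictly smaller
```

With three quarters of the weights robust at negligible risk cost, the schematic bound halves, which is the sense in which layerwise robustness could tighten parameter-counting bounds.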
5 Joint robustness
The theoretical analysis suggests that robustness to either reinitialization or rerandomization could imply better generalization. Combined with the experimental results of the previous sections, this seems to offer a way to explain the empirical observation that hugely overparameterized networks can still generalize well: they use only a small portion of their full capacity. However, there is a caveat: the reinitialization and rerandomization analysis in Section 3 studies each layer independently, and two or more layers being independently robust does not necessarily imply that they are robust jointly. If, for example, we want a generalization bound that uses only half of the capacity, we need to show that half of the layers are robust to reinitialization or rerandomization simultaneously.
5.1 Are robust layers jointly robust?
In this section, we perform a joint robustness analysis on groups of layers. From Section 3.1, we saw that on MNIST, for wide enough FCNs, all the layers above layer1 are robust to reinitialization. We therefore divide the layers into two groups, {layer1} and {layer2, layer3, …}, and perform the robustness studies on the two groups. The results for the FCN are shown in Figure 8(a). For clarity and ease of comparison, the figure still spells out all the layers individually, but the values from layer2 to layer6 are simply repeated rows. The values show that the upper-layer group is clearly not jointly robust to reinitialization (to checkpoint 0).
We also try some alternative grouping schemes: Figure 8(b) shows the results when we group two of every three layers, which slightly improves joint robustness. In Figure 8(c), the grouping scheme that includes every other layer shows that, with a clever grouping scheme, about half of the layers can be jointly robust.
Results on ResNets are similar. Figure 9 shows the joint robustness analysis on ResNets trained on CIFAR10. The grouping is based on the layerwise robustness results from Figure 5: all the residual blocks in stage1 to stage4 are bundled and analyzed jointly. The results resemble those for the FCNs: ResNet18 is relatively robust, but deeper ResNets are not jointly robust under this grouping. Two alternative grouping schemes are shown in Figure 10. By including only layers from stage1 and stage4, slightly improved robustness can be obtained on ResNet50. The scheme that groups every other residual block shows further improvements.
In summary, the individually robust layers are generally not jointly robust. But with a careful choice of a subset of the layers, joint robustness can still be achieved for up to half of the layers. In principle, one could enumerate all possible grouping schemes to find the best one, trading off robustness against the number of layers included.
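The enumeration mentioned above can be sketched as a search over subsets of layers. The `robustness_drop` function below is a hypothetical stand-in for actually evaluating the jointly reinitialized network; its "adjacent layers interfere" behavior merely mimics the every-other-layer observation.

```python
from itertools import combinations

def largest_robust_group(layers, robustness_drop, max_drop):
    """Enumerate grouping schemes from largest to smallest and return the
    first (hence largest) subset whose joint reinitialization causes an
    acceptable error increase."""
    for k in range(len(layers), 0, -1):
        for group in combinations(layers, k):
            if robustness_drop(group) <= max_drop:
                return group
    return ()

layers = ["layer2", "layer3", "layer4", "layer5"]

def robustness_drop(group):
    # Hypothetical model: adjacent layers are not jointly robust.
    idx = sorted(layers.index(g) for g in group)
    return 5.0 if any(b - a == 1 for a, b in zip(idx, idx[1:])) else 0.5

best = largest_robust_group(layers, robustness_drop, max_drop=1.0)
```

The search is exponential in the number of layers, so in practice one would rely on heuristics (such as the alternating scheme) rather than exhaustive enumeration.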
5.2 Could robust layers be made jointly robust?
Results from the previous section show that there is a gap between the layerwise robustness patterns and joint robustness. Here, we try to close the gap by letting the training algorithm know that we are interested in the robustness of a subset of the layers. Expressing this desire algorithmically is complicated, but we can make a stronger request by asking the learning algorithm to explicitly not "use" those layers. More specifically, we apply two interventions to the group of layers that we wish to be robust: 1) freeze them, so that their parameters keep their randomly initialized values; 2) remove the layers completely from the neural network architecture.
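The two interventions can be sketched on a toy dict-of-layers model; layer names, values, and gradients are illustrative.

```python
def train_step(weights, grads, lr, frozen):
    """One SGD step that skips layers flagged as frozen, leaving them at
    their randomly initialized values (intervention 1)."""
    return {name: (w if name in frozen
                   else [wi - lr * gi for wi, gi in zip(w, grads[name])])
            for name, w in weights.items()}

def remove_layers(weights, to_remove):
    """The stronger intervention: drop the layers from the architecture
    entirely (intervention 2)."""
    return {name: w for name, w in weights.items() if name not in to_remove}

weights = {"layer1": [1.0], "layer2": [2.0], "layer3": [3.0]}
grads   = {"layer1": [0.5], "layer2": [0.5], "layer3": [0.5]}
weights = train_step(weights, grads, lr=1.0, frozen={"layer2"})
# layer2 keeps its initialization; layer1 and layer3 are updated
slim = remove_layers(weights, {"layer2"})
```

In a real framework, freezing corresponds to excluding a layer's parameters from the optimizer (or disabling their gradients), while removal changes the architecture definition itself.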
Table 1: Test error rates (%).

          Arch       Full Model  Layerwise Robustness  Layers Frozen  Layers Removed
CIFAR10   ResNet50   8.40        9.77 ± 1.38           11.74          9.23
          ResNet101  8.53        8.87 ± 0.50           9.21           9.23
          ResNet152  8.54        8.74 ± 0.39           9.17           9.23
ImageNet  ResNet50   34.74       38.54 ± 5.36          44.36          41.50
          ResNet101  32.78       33.84 ± 2.10          36.03          41.50
          ResNet152  31.74       32.42 ± 1.55          35.75          41.50
The results are shown in Table 1. When we explicitly freeze the layers, the test error rates are still higher than the average layerwise robustness measured on a normally trained model. However, the gap is much smaller than when measuring the joint robustness directly (see Figure 9 for comparison). Moreover, on CIFAR10, we found that similar performance can be achieved even if we completely remove those layers from the network. On ImageNet, on the other hand, the frozen random layers seem to be needed to achieve good performance, while the "layers-removed" variant underperforms by a large gap. In this case, the random projections (with nonlinearities) in the frozen layers help the performance.
6 Connections to other notions of robustness
The notion of layerwise (and joint) robustness to reinitialization and rerandomization can be related to other notions of robustness in deep learning. For example, the flatness of a solution is a notion of robustness with respect to local perturbations of the network parameters (at convergence) and is extensively discussed in the context of generalization (Hochreiter and Schmidhuber, 1997; Chaudhari et al., 2017; Keskar et al., 2017; Smith and Le, 2018; Poggio et al., 2018). For a fixed layer, our notion of robustness to reinitialization is more restricted, because the "perturbed values" can only come from the optimization trajectory, while robustness to rerandomization potentially allows larger perturbation variances. However, as our studies show, the robustness or flatness at each layer can behave very differently, so analyzing each layer individually, in the context of a specific network architecture, allows us to gain more insight into the robustness behaviors.
On the other hand, adversarial robustness (Szegedy et al., 2013) focuses on robustness with respect to perturbations of the inputs. In particular, it has been found that trained deep neural network models are sensitive to input perturbations: small, adversarially generated perturbations can usually change the prediction to an arbitrary different class. A large number of attack and defense algorithms have been proposed along this line in recent years. Here, we briefly discuss the connection to adversarial robustness. Take a normally trained ResNet,^2 with a number of stages each containing several residual blocks. Given a configuration (r, s), during each test evaluation, s stages are randomly chosen, and for each of the chosen stages, a random residual block is picked and replaced with one of r pre-initialized sets of weights for that block. We keep r pre-allocated sets of weights for each residual block, instead of resampling random numbers on each evaluation call, primarily to reduce the computational burden at test time.

^2 We use a slightly modified variant with an explicit downsample layer between stages, so that all the residual blocks have real identity skip connections. See Figure 7.
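This stochastic evaluation procedure can be sketched as follows; the nested-dict weight layout and the stage/block names are illustrative stand-ins for the actual ResNet.

```python
import random

def randomized_weights(trained, preallocated, r, s, rng):
    """For one evaluation pass: choose s stages at random; in each chosen
    stage, replace one random residual block's weights with one of its r
    pre-allocated random weight sets (pre-allocation avoids resampling
    fresh random numbers at every call)."""
    out = {stage: dict(blocks) for stage, blocks in trained.items()}
    for stage in rng.sample(sorted(trained), s):
        block = rng.choice(sorted(trained[stage]))
        out[stage][block] = preallocated[stage][block][rng.randrange(r)]
    return out

trained = {"stage1": {"blk1": "w11", "blk2": "w12"},
           "stage2": {"blk1": "w21", "blk2": "w22"}}
prealloc = {st: {bk: [f"rand_{st}_{bk}_{i}" for i in range(4)]
                 for bk in blocks}
            for st, blocks in trained.items()}
rng = random.Random(0)
noisy = randomized_weights(trained, prealloc, r=4, s=1, rng=rng)
```

Each call produces a differently perturbed copy of the weights, so the classifier's outputs vary from evaluation to evaluation, which is what makes gradient-based attacks harder to mount.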
From the previous robustness analysis, we expect this stochastic classifier to incur only a small performance drop when averaged over the test set. At the level of individual examples, however, the randomness of the network outputs makes it harder for an attacker to generate adversarial examples. We evaluate the adversarial robustness against a weak FGSM (Goodfellow et al., 2014) attack and a strong PGD (Madry et al., 2017) attack. The results in Table 2 show that, compared to the baseline (the exact same trained model before being turned into a stochastic classifier), the randomness significantly increases adversarial robustness against weak attacks. The performance under the strong PGD attack drops to a very low level, but a nontrivial gap to the baseline remains.
In summary, layerwise robustness can improve the adversarial robustness of a trained model through injected stochasticity. However, it is not a good defense against strong attackers: more sophisticated attacks that explicitly account for stochastic classifiers are likely to break this model completely.
Model Configuration   Clean   FGSM   PGD
baseline
r=4, s=1
r=4, s=2
baseline
r=4, s=1
r=4, s=2
r=4, s=4
Table 2: The adversarial attacks are evaluated on a subset of 1000 test examples. Every experiment is repeated 5 times and the average performance is reported. The hyperparameters r and s in the model configurations denote the number of random weight sets pre-created for each residual block and the number of stages that are rerandomized during each inference pass, respectively. The first group of rows corresponds to a ResNet architecture with two stages, where each stage contains four residual blocks; the second corresponds to a ResNet with four stages, each with four residual blocks.

7 Conclusions
We studied a wide variety of popular models for image classification and investigated the functional structure of overparameterized deep models on a layer-by-layer basis. We introduced the notions of reinitialization and rerandomization robustness. Using these notions, we provided evidence for the heterogeneous character of layers, which can be broadly categorized as either "robust" or "critical". Resetting the robust layers to their initial values has no negative consequence for the model's performance. Our empirical results give further evidence that mere parameter counting or norm accounting is too coarse for studying the generalization of deep models. Moreover, optimization-landscape-based analysis (e.g. flatness or sharpness at the minimizer) is better performed with respect to the network architecture, due to the heterogeneous behaviors of the different layers. For future work, we are interested in devising new algorithms that learn interleaved trained and partially random subnetworks within one large network.
Acknowledgments
The authors would like to thank David Grangier, Lechao Xiao, Kunal Talwar and Hanie Sedghi for helpful discussions and comments.
References
 Allen-Zhu et al. (2018) Allen-Zhu, Z., Li, Y., and Song, Z. (2018). A convergence theory for deep learning via over-parameterization. CoRR, arXiv:1811.03962.
 Anthony and Bartlett (2009) Anthony, M. and Bartlett, P. L. (2009). Neural Network Learning: Theoretical Foundations. Cambridge University Press.
 Arora et al. (2018) Arora, S., Ge, R., Neyshabur, B., and Zhang, Y. (2018). Stronger generalization bounds for deep nets via a compression approach. CoRR, arXiv:1802.05296.
 Bartlett et al. (2017) Bartlett, P. L., Foster, D. J., and Telgarsky, M. J. (2017). Spectrallynormalized margin bounds for neural networks. In Advances in Neural Information Processing Systems, pages 6240–6249.
 Chaudhari et al. (2017) Chaudhari, P., Choromanska, A., Soatto, S., LeCun, Y., Baldassi, C., Borgs, C., Chayes, J., Sagun, L., and Zecchina, R. (2017). Entropy-SGD: Biasing gradient descent into wide valleys. In ICLR.
 Delalleau and Bengio (2011) Delalleau, O. and Bengio, Y. (2011). Shallow vs. deep sum-product networks. In NIPS, pages 666–674.
 Du et al. (2018a) Du, S. S., Lee, J. D., Li, H., Wang, L., and Zhai, X. (2018a). Gradient descent finds global minima of deep neural networks. CoRR, arXiv:1811.03804.
 Du et al. (2018b) Du, S. S., Zhai, X., Poczos, B., and Singh, A. (2018b). Gradient descent provably optimizes overparameterized neural networks. CoRR, arXiv:1810.02054.
 Dziugaite and Roy (2016) Dziugaite, G. K. and Roy, D. M. (2016). Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data. In UAI.
 Eldan and Shamir (2015) Eldan, R. and Shamir, O. (2015). The Power of Depth for Feedforward Neural Networks. CoRR, arXiv:1512.03965.
 Goodfellow et al. (2014) Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and harnessing adversarial examples. CoRR, arXiv:1412.6572.

 Gybenko (1989) Gybenko, G. (1989). Approximation by superposition of sigmoidal functions. Mathematics of Control, Signals and Systems, 2(4):303–314.
 Han et al. (2015) Han, S., Mao, H., and Dally, W. J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. CoRR, arXiv:1510.00149.
 Hardt and Ma (2017) Hardt, M. and Ma, T. (2017). Identity matters in deep learning. In ICLR.
 He et al. (2016a) He, K., Zhang, X., Ren, S., and Sun, J. (2016a). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778.
 He et al. (2016b) He, K., Zhang, X., Ren, S., and Sun, J. (2016b). Identity mappings in deep residual networks. In European conference on computer vision, pages 630–645. Springer.
 Hinton et al. (2015) Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. CoRR, arXiv:1503.02531.
 Hochreiter and Schmidhuber (1997) Hochreiter, S. and Schmidhuber, J. (1997). Flat minima. Neural Computation, 9(1):1–42.
 Hornik (1991) Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural networks, 4(2):251–257.
 Kawaguchi et al. (2017) Kawaguchi, K., Kaelbling, L. P., and Bengio, Y. (2017). Generalization in deep learning. CoRR, arXiv:1710.05468.
 Keskar et al. (2017) Keskar, N. S., Mudigere, D., Nocedal, J., Smelyanskiy, M., and Tang, P. T. P. (2017). On large-batch training for deep learning: Generalization gap and sharp minima. In ICLR.
 Lee et al. (2019) Lee, J., Xiao, L., Schoenholz, S., Bahri, Y., Sohl-Dickstein, J., and Pennington, J. (2019). Wide neural networks of any depth evolve as linear models under gradient descent. Technical report, private communication.
 Liang et al. (2017) Liang, T., Poggio, T., Rakhlin, A., and Stokes, J. (2017). Fisher-Rao metric, geometry, and complexity of neural networks. CoRR, arXiv:1711.01530.
 Madry et al. (2017) Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2017). Towards deep learning models resistant to adversarial attacks. CoRR, arXiv:1706.06083.
 Mhaskar and Poggio (2016) Mhaskar, H. and Poggio, T. A. (2016). Deep vs. shallow networks: An approximation theory perspective. CoRR, arXiv:1608.03287.
 Montufar et al. (2014) Montufar, G. F., Pascanu, R., Cho, K., and Bengio, Y. (2014). On the number of linear regions of deep neural networks. In Advances in neural information processing systems (NIPS), pages 2924–2932.
 Neyshabur et al. (2017) Neyshabur, B., Bhojanapalli, S., McAllester, D., and Srebro, N. (2017). Exploring generalization in deep learning. In Advances in Neural Information Processing Systems, pages 5947–5956.
 Neyshabur et al. (2018) Neyshabur, B., Bhojanapalli, S., and Srebro, N. (2018). A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks. In ICLR.
 Neyshabur et al. (2015) Neyshabur, B., Salakhutdinov, R., and Srebro, N. (2015). Path-SGD: Path-normalized optimization in deep neural networks. In NIPS, pages 2422–2430.
 Nguyen and Hein (2018) Nguyen, Q. and Hein, M. (2018). Optimization Landscape and Expressivity of Deep CNNs. In International Conference on Machine Learning, pages 3727–3736.
 Pinkus (1999) Pinkus, A. (1999). Approximation theory of the MLP model in neural networks. Acta Numerica, 8:143–195.
 Poggio et al. (2018) Poggio, T., Liao, Q., Miranda, B., Banburski, A., Boix, X., and Hidary, J. (2018). Theory IIIb: Generalization in deep networks. Technical report, MIT.
 Rolnick and Tegmark (2017) Rolnick, D. and Tegmark, M. (2017). The power of deeper networks for expressing natural functions. CoRR, arXiv:1705.05502.
 Rosenfeld and Tsotsos (2018) Rosenfeld, A. and Tsotsos, J. K. (2018). Intriguing Properties of Randomly Weighted Networks: Generalizing While Learning Next to Nothing. CoRR, arXiv:1802.00844.
 Shaham et al. (2015) Shaham, U., Cloninger, A., and Coifman, R. R. (2015). Provable approximation properties for deep neural networks. CoRR, arXiv:1509.07385.
 Simonyan and Zisserman (2014) Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. CoRR, arXiv:1409.1556.
 Smith and Le (2018) Smith, S. L. and Le, Q. V. (2018). A Bayesian perspective on generalization and stochastic gradient descent. In ICLR.
 Szegedy et al. (2013) Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2013). Intriguing properties of neural networks. CoRR, arXiv:1312.6199.
 Telgarsky (2016) Telgarsky, M. (2016). Benefits of depth in neural networks. In Feldman, V., Rakhlin, A., and Shamir, O., editors, 29th Annual Conference on Learning Theory, volume 49 of Proceedings of Machine Learning Research, pages 1517–1539, Columbia University, New York, New York, USA. PMLR.
 Vapnik (1998) Vapnik, V. N. (1998). Statistical Learning Theory. Adaptive and learning systems for signal processing, communications, and control. Wiley.
 Veit et al. (2016) Veit, A., Wilber, M. J., and Belongie, S. (2016). Residual networks behave like ensembles of relatively shallow networks. In Advances in Neural Information Processing Systems, pages 550–558.
 Wang et al. (2018) Wang, H., Keskar, N. S., Xiong, C., and Socher, R. (2018). Identifying Generalization Properties in Neural Networks. CoRR, arXiv:1809.07402.
 Yun et al. (2018) Yun, C., Sra, S., and Jadbabaie, A. (2018). Finite sample expressive power of small-width ReLU networks. CoRR, arXiv:1810.07770.
 Zhang et al. (2017) Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. In ICLR.
 Zhou et al. (2019) Zhou, W., Veitch, V., Austern, M., Adams, R. P., and Orbanz, P. (2019). Non-vacuous generalization bounds at the ImageNet scale: a PAC-Bayesian compression approach. In ICLR.
 Zou et al. (2018) Zou, D., Cao, Y., Zhou, D., and Gu, Q. (2018). Stochastic gradient descent optimizes overparameterized deep ReLU networks. CoRR, arXiv:1811.08888.
Appendix A Details on experiment setup
Our empirical studies are based on the MNIST, CIFAR10 and ILSVRC 2012 ImageNet datasets. Stochastic Gradient Descent (SGD) with a momentum of 0.9 is used to minimize the multiclass cross-entropy loss. Each model is trained for 100 epochs, using a stage-wise constant learning rate schedule that multiplies the rate by a factor of 0.2 at epochs 30, 60, and 90. A batch size of 128 is used, except for ResNets with more than 50 layers on ImageNet, where a batch size of 64 is used due to device memory constraints.
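For concreteness, the learning rate schedule above can be sketched as a small function; the base rate of 0.1 below is a placeholder, as the initial learning rate is not stated here:

```python
def learning_rate(epoch, base_lr=0.1):
    """Stage-wise constant schedule: multiply the rate by 0.2 at
    epochs 30, 60, and 90. base_lr = 0.1 is illustrative only."""
    factor = 1.0
    for milestone in (30, 60, 90):
        if epoch >= milestone:
            factor *= 0.2
    return base_lr * factor
```

The rate is thus constant within each stage and decays by 5x at each milestone.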
We mainly study three types of neural network architectures:

FCNs: multilayer perceptrons consisting of fully connected layers with equal output dimension and ReLU activation (except for the last layer, where the output dimension equals the number of classes and no ReLU is applied). For example, FCN
has three fully connected layers with output dimension 256, and an extra final (fully connected) classifier layer. 
VGGs: widely used network architectures from Simonyan and Zisserman (2014).

ResNets: the results from our analysis are similar for ResNets V1 (He et al., 2016a) and V2 (He et al., 2016b). We report our results with ResNets V2 due to its slightly better performance in most cases. For the large image sizes of ImageNet, stage 0 contains a convolution and a max pooling (both with stride 2) that reduce the spatial dimension (from 224 to 56). For smaller image sizes, as in CIFAR10, we use a convolution with stride 1 here to avoid reducing the spatial dimension.
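As a minimal sketch of the FCN forward pass described above (numpy, biases omitted; the He-style initialization and the concrete dimensions are our illustrative assumptions, not necessarily the exact training setup):

```python
import numpy as np

def init_fcn(input_dim, num_classes, width=256, depth=3, seed=0):
    # `depth` fully connected layers of equal output dimension `width`,
    # plus a final classifier layer with `num_classes` outputs.
    rng = np.random.default_rng(seed)
    dims = [input_dim] + [width] * depth + [num_classes]
    return [rng.standard_normal((m, n)) * np.sqrt(2.0 / m)
            for m, n in zip(dims[:-1], dims[1:])]

def fcn_forward(params, x):
    # ReLU after every layer except the last (classifier) layer.
    for W in params[:-1]:
        x = np.maximum(x @ W, 0.0)
    return x @ params[-1]
```

For MNIST, `input_dim` would be 784 and `num_classes` 10.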
During training, CIFAR10 images are padded with 4 pixels of zeros on all sides, then randomly flipped (horizontally) and cropped. ImageNet images are randomly cropped during training and center-cropped during testing. On each dataset, the global mean and standard deviation are computed over all training pixels and used to normalize the inputs.
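The CIFAR10 training augmentation above can be sketched as follows (a minimal numpy version; implementation details such as the relative order of flip and crop may differ in practice):

```python
import numpy as np

def augment_cifar(img, rng):
    """Pad 4 pixels of zeros on every side, randomly flip horizontally,
    then take a random 32x32 crop. `img` is a (32, 32, C) array."""
    padded = np.pad(img, ((4, 4), (4, 4), (0, 0)))  # -> (40, 40, C), zeros
    if rng.random() < 0.5:
        padded = padded[:, ::-1, :]                 # horizontal flip
    top, left = rng.integers(0, 9, size=2)          # 40 - 32 + 1 offsets
    return padded[top:top + 32, left:left + 32, :]
```

Each call draws a fresh flip and crop offset from the supplied `rng`.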
Appendix B Batch normalization and weight decay
The primary goal of this paper is to study the (co-)evolution of the representations at each layer during training and the robustness of these representations with respect to the rest of the network. In our analysis we try to minimize the factors that explicitly encourage the network weights or representations to change. In particular, unless otherwise specified, weight decay and batch normalization are not used. This leads to some performance drop in the trained models, especially for deep residual networks: even though we can successfully train residual networks with more than 100 layers without batch normalization, their final generalization performance can be considerably worse than the state of the art. Therefore, in this section, we include studies on networks trained with weight decay and batch normalization for comparison.
Table 3: Final test error rates (%) of models trained with and without weight decay (+wd) and batch normalization (+bn).

Architecture | N/A  | +wd  | +bn  | +wd+bn
CIFAR10
ResNet18     | 10.4 | 7.5  | 6.9  | 5.5
ResNet34     | 10.2 | 6.9  | 6.6  | 5.1
ResNet50     | 8.4  | 9.9  | 7.6  | 5.0
ResNet101    | 8.5  | 9.8  | 6.9  | 5.3
ResNet152    | 8.5  | 9.7  | 7.3  | 4.7
VGG11        | 11.8 | 10.7 | 9.4  | 8.2
VGG13        | 10.3 | 8.8  | 8.4  | 6.7
VGG16        | 11.0 | 11.4 | 8.5  | 6.7
VGG19        | 12.1 | –    | 8.6  | 6.9
ImageNet
ResNet18     | 41.1 | 33.1 | 33.5 | 31.5
ResNet34     | 39.9 | 30.6 | 30.1 | 27.2
ResNet50     | 34.8 | 31.8 | 28.2 | 25.0
ResNet101    | 32.9 | 29.9 | 26.9 | 22.9
ResNet152    | 31.9 | 29.1 | 27.6 | 22.6
In particular, Table 3 shows the final test error rates of models trained with or without weight decay and batch normalization. Note that the original VGG models do not use batch normalization (Simonyan and Zisserman, 2014); we list +bn variants here for comparison, obtained by applying batch normalization to the output of each convolutional layer. On CIFAR10, the performance gap varies from 3% to 5%, but on ImageNet a performance gap as large as 10% can be seen when training without weight decay and batch normalization. Figure 11 shows how different training configurations affect the layer-wise robustness patterns of VGG16 networks. We found that when batch normalization is used, none of the layers are robust anymore.
Figure 12 and Figure 13 show similar comparisons for ResNet50 on CIFAR10 and ImageNet, respectively. Unlike for VGGs, we found that the layer-wise robustness patterns remain quite pronounced under various training conditions for ResNets. In Figure 12(d) and Figure 13(c,d), we see the mysterious phenomenon that re-initializing with checkpoint 1 is less robust than with checkpoint 0 for many layers. We do not know exactly why this happens. It might be that aggressive learning during the early stages causes changes of large magnitude in the parameters or statistics, and that later on, when most of the training samples are classified correctly, the network gradually rebalances the layers into a more robust state. Figure 15(d–f) in the next section shows supporting evidence that, in this case, the distance of the parameters between checkpoint 0 and checkpoint 1 is larger than between checkpoint 0 and the final checkpoint. On ImageNet, however, this correlation is no longer clear, as seen in Figure 16(d–f). See the discussion in the next section for more details.
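For reference, the re-initialization probe used throughout this appendix amounts to the following sketch; the dict-of-weights representation and the `evaluate` callback are hypothetical stand-ins for the actual model code:

```python
import copy

def reinit_robustness(trained_params, checkpoint_params, layer, evaluate):
    """Reset a single layer to its value at an earlier checkpoint
    (checkpoint 0 = random initialization, checkpoint 1 = after the
    first epoch, ...), keep all other layers at their trained values,
    and re-evaluate the network without any retraining.

    trained_params / checkpoint_params: dicts mapping layer name -> weights.
    evaluate: callable returning the test error for a full parameter dict.
    """
    probed = copy.deepcopy(trained_params)
    probed[layer] = copy.deepcopy(checkpoint_params[layer])
    return evaluate(probed)
```

A layer is called robust when this probe barely changes the test error.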
Appendix C Robustness and distances
In Figure 1 in Section 3.1, we compared the layer-wise robustness pattern to the layer-wise distances of the parameters from their values at initialization (checkpoint 0). We found that for FCNs on MNIST, there is no obvious correlation between the “amount of parameter updates received” at each layer and its robustness to re-initialization for the two distances (the normalized ℓ2 and ℓ∞ norms) we measured. In this appendix, we list results on the other models and datasets studied in this paper for comparison.
Figure 14 shows the layer-wise robustness plot along with the layer-wise distance plots for VGG16 trained on CIFAR10. We found that the distances of the top layers are large, yet the model is robust when we re-initialize those layers. However, the normalized distance seems to be correlated with the layer-wise robustness patterns: the lower layers that are less robust have larger distances to their initialized values.
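The layer-wise distance computation can be sketched as follows; the two concrete measures here, a 2-norm normalized by the norm of the initialization and the infinity norm, are our assumptions about the distances discussed above:

```python
import numpy as np

def layer_distances(trained, init):
    """Per-layer distance between trained weights and their values at
    checkpoint 0. trained, init: dicts mapping layer name -> array."""
    out = {}
    for name, w in trained.items():
        d = (w - init[name]).ravel()
        out[name] = {
            # 2-norm of the update, normalized by the initialization's norm
            "normalized_l2": np.linalg.norm(d) / np.linalg.norm(init[name].ravel()),
            # largest single-coordinate change
            "linf": float(np.abs(d).max()),
        }
    return out
```

Comparing these per-layer values against the robustness heatmaps gives the correlation plots discussed in this appendix.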
Similar plots for ResNet50 on CIFAR10 and ImageNet are shown in Figure 15 and Figure 16, respectively. In each figure, we also show extra results for models trained with weight decay and batch normalization. For the case without weight decay and batch normalization, we can see a weak correlation: the layers that are sensitive have slightly larger distances to their random initialization values. For the case with weight decay and batch normalization, the situation is less clear. First of all, in Figure 15(e–f), we see very large distances in a few layers at checkpoint 1. This provides a potential explanation for the mysterious pattern that re-initialization to checkpoint 1 is more sensitive than to checkpoint 0. Similar observations can be made in Figure 16(e–f) for ImageNet.
Appendix D Alternative visualizations
The empirical results on layer robustness are mainly visualized as heatmaps in the main text. The heatmaps allow uncluttered comparison of the results across layers and training epochs. However, it is not easy to distinguish numerical values that are close to each other from the color coding alone. In this section, we provide alternative visualizations that show the same results as line plots. In particular, Figure 17 shows the layer-wise robustness analysis for VGG16 on CIFAR10. Figure 18 and Figure 19 show the results for ResNet50 on CIFAR10 and ImageNet, respectively.