The intriguing role of module criticality in the generalization of deep networks

12/02/2019
by Niladri S. Chatterji, et al.

We study the phenomenon that some modules of deep neural networks (DNNs) are more critical than others: rewinding their parameter values back to initialization, while keeping the other modules fixed at their trained values, results in a large drop in the network's performance. Our analysis reveals interesting properties of the loss landscape, which lead us to propose a complexity measure, called module criticality, based on the shape of the valleys that connect the initial and final values of the module parameters. We formulate how generalization relates to module criticality, and show that this measure can explain the superior generalization performance of some architectures over others, whereas earlier measures fail to do so.
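The rewinding experiment behind module criticality can be summarized in a short sketch. The following is a minimal illustration only, not the authors' implementation; it assumes a PyTorch-style model, and `model`, `init_state`, `trained_state`, and `evaluate` are hypothetical placeholders for a trained network, its saved initial and trained state dicts, and a held-out accuracy function.

# Minimal sketch of the module-rewinding probe described in the abstract;
# not the authors' code. All names here are hypothetical placeholders.
import copy

def module_criticality_probe(model, init_state, trained_state, evaluate):
    """For each top-level module, rewind its parameters to initialization,
    keep every other module at its trained values, and record the drop
    in performance relative to the fully trained network."""
    model.load_state_dict(trained_state)
    baseline = evaluate(model)
    drops = {}
    for name, _ in model.named_children():
        rewound = dict(trained_state)
        for key, value in init_state.items():
            if key.startswith(name + "."):
                rewound[key] = value  # reset only this module's parameters
        probe = copy.deepcopy(model)
        probe.load_state_dict(rewound)
        drops[name] = baseline - evaluate(probe)  # large drop = critical module
    return drops

A module whose rewinding causes a large drop is the kind of "critical" module the paper studies; the criticality measure itself further accounts for the shape of the valley connecting the initial and final parameter values.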


Related research

03/10/2021  Why Flatness Correlates With Generalization For Deep Neural Networks
The intuition that local flatness of the loss landscape is correlated wi...

03/10/2021  Robustness to Pruning Predicts Generalization in Deep Neural Networks
Existing generalization measures that aim to capture a model's simplicit...

05/27/2019  Structure Learning for Neural Module Networks
Neural Module Networks, originally proposed for the task of visual quest...

02/03/2019  An Empirical Study on Regularization of Deep Neural Networks by Local Rademacher Complexity
Regularization of Deep Neural Networks (DNNs) for the sake of improving ...

07/13/2021  How many degrees of freedom do we need to train deep networks: a loss landscape perspective
A variety of recent works, spanning pruning, lottery tickets, and traini...

11/18/2015  ACDC: A Structured Efficient Linear Layer
The linear layer is one of the most pervasive modules in deep learning r...
