On the importance of single directions for generalization

03/19/2018
by   Ari S. Morcos, et al.

Despite their ability to memorize large datasets, deep neural networks often achieve good generalization performance. However, the differences between the learned solutions of networks which generalize and those which do not remain unclear. Additionally, the tuning properties of single directions (defined as the activation of a single unit or some linear combination of units in response to some input) have been highlighted, but their importance has not been evaluated. Here, we connect these lines of inquiry to demonstrate that a network's reliance on single directions is a good predictor of its generalization performance, across networks trained on datasets with different fractions of corrupted labels, across ensembles of networks trained on datasets with unmodified labels, across different hyperparameters, and over the course of training. While dropout only regularizes this quantity up to a point, batch normalization implicitly discourages single direction reliance, in part by decreasing the class selectivity of individual units. Finally, we find that class selectivity is a poor predictor of task importance, suggesting not only that networks which generalize well minimize their dependence on individual units by reducing their selectivity, but also that individually selective units may not be necessary for strong network performance.
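
To make the two quantities in the abstract concrete, here is a minimal sketch in plain Python. It assumes non-negative (for example post-ReLU) activations, takes "ablation" to mean clamping a unit's activation to zero, and uses one common form of a class selectivity index: the gap between the unit's highest class-conditional mean activation and the mean of the remaining classes' means, divided by their sum. The function names `class_selectivity` and `ablate_units` and the `eps` parameter are illustrative choices, not taken from the text above.

```python
from collections import defaultdict


def class_selectivity(activations, labels, eps=1e-12):
    """Selectivity index of one unit, assuming non-negative activations.

    activations: one activation value per input example.
    labels: the class label of each example (at least two distinct classes).
    Returns a value near 0 for a unit that responds equally to all classes
    and near 1 for a unit that responds to a single class only.
    """
    per_class = defaultdict(list)
    for value, label in zip(activations, labels):
        per_class[label].append(value)

    # Class-conditional mean activation of the unit.
    means = {c: sum(v) / len(v) for c, v in per_class.items()}
    best = max(means, key=means.get)
    mu_max = means[best]

    # Average of the class-conditional means over all other classes.
    rest = [m for c, m in means.items() if c != best]
    mu_rest = sum(rest) / len(rest)

    # eps (an assumption of this sketch) guards against division by zero.
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)


def ablate_units(hidden, unit_indices):
    """Return a copy of a batch of hidden activations with chosen units zeroed.

    hidden: list of rows, one row of unit activations per example.
    unit_indices: indices of the units (single directions) to clamp to zero.
    """
    dead = set(unit_indices)
    return [
        [0.0 if j in dead else h for j, h in enumerate(row)]
        for row in hidden
    ]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    print(class_selectivity([1.0, 1.0, 0.0, 0.0], labels))  # highly selective unit
    print(class_selectivity([0.5, 0.5, 0.5, 0.5], labels))  # non-selective unit
    print(ablate_units([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [1]))
```

In this framing, a network's reliance on single directions could be probed by zeroing progressively more units with `ablate_units`, feeding the result to the rest of the network, and recording how quickly accuracy falls; the abstract's claim is that a slower fall goes with better generalization, and that a unit's selectivity says little about how much its ablation hurts.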
