On the Origins of the Block Structure Phenomenon in Neural Network Representations

02/15/2022
by Thao Nguyen, et al.

Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations. This block structure has two seemingly contradictory properties: on the one hand, its constituent layers exhibit highly similar dominant first principal components (PCs), but on the other hand, their representations, and their common first PC, are highly dissimilar across different random seeds. Our work seeks to reconcile these discrepant properties by investigating the origin of the block structure in relation to the data and training methods. By analyzing properties of the dominant PCs, we find that the block structure arises from dominant datapoints: a small group of examples that share similar image statistics (e.g., background color). However, the set of dominant datapoints, and the precise shared image statistic, can vary across random seeds. Thus, the block structure reflects meaningful dataset statistics, yet is simultaneously unique to each model. By studying hidden layer activations and creating synthetic datapoints, we demonstrate that these simple image statistics dominate the representational geometry of the layers inside the block structure. We also explore how the phenomenon evolves through training, finding that the block structure takes shape early in training, while the underlying representations and the corresponding dominant datapoints continue to change substantially. Finally, we study the interplay between the block structure and different training mechanisms, introducing a targeted intervention to eliminate the block structure, and examining the effects of pretraining and Shake-Shake regularization.
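The analyses described above rest on two diagnostics: cross-layer representational similarity (the companion work by the same authors measures this with linear CKA) and the dominant first principal component of each layer's activations. Below is a minimal NumPy sketch of how these might be computed, assuming linear CKA as the similarity measure; `layer_acts`, all function names, and the top-projection heuristic for dominant datapoints are illustrative, not taken from the paper.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two activation matrices of shape (n_examples, n_features)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F), per Kornblith et al. (2019)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den

def first_pc_stats(X, k=10):
    """Fraction of variance explained by the first PC, plus the indices of the
    k examples projecting most strongly onto it ("dominant datapoints")."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_frac = s[0] ** 2 / np.sum(s ** 2)
    scores = Xc @ Vt[0]                      # projection of each example onto PC1
    dominant = np.argsort(-np.abs(scores))[:k]
    return var_frac, dominant

def cka_matrix(layer_acts):
    """Pairwise linear CKA across layers; a block structure appears as a
    bright contiguous square on the diagonal of this matrix."""
    L = len(layer_acts)
    M = np.eye(L)
    for i in range(L):
        for j in range(i + 1, L):
            M[i, j] = M[j, i] = linear_cka(layer_acts[i], layer_acts[j])
    return M

# layer_acts: hypothetical list of (n_examples, n_features) activation arrays,
# one per hidden layer (e.g., collected via forward hooks on a trained model).
# Layers inside the block structure should show var_frac close to 1, with the
# same small set of dominant datapoint indices recurring across those layers.
```

Under these assumptions, comparing `first_pc_stats` outputs across two training runs would surface the paper's second observation: each seed's block can latch onto a different set of dominant datapoints and a different shared image statistic.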
