Limitations of Neural Collapse for Understanding Generalization in Deep Learning

02/17/2022
by Like Hui, et al.

The recent work of Papyan, Han, and Donoho (2020) presented an intriguing "Neural Collapse" phenomenon, showing a structural property of interpolating classifiers in the late stage of training. This opened a rich area of exploration studying this phenomenon. Our motivation is to study the upper limits of this research program: How far will understanding Neural Collapse take us in understanding deep learning? First, we investigate its role in generalization. We refine the Neural Collapse conjecture into two separate conjectures: collapse on the train set (an optimization property) and collapse on the test distribution (a generalization property). We find that while Neural Collapse often occurs on the train set, it does not occur on the test set. We thus conclude that Neural Collapse is primarily an optimization phenomenon, with as-yet-unclear connections to generalization. Second, we investigate the role of Neural Collapse in feature learning. We show simple, realistic experiments where training longer leads to worse last-layer features, as measured by transfer performance on a downstream task. This suggests that Neural Collapse is not always desirable for representation learning, as previously claimed. Finally, we give preliminary evidence of a "cascading collapse" phenomenon, wherein some form of Neural Collapse occurs not only for the last layer, but in earlier layers as well. We hope our work encourages the community to continue the rich line of Neural Collapse research, while also considering its inherent limitations.
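The abstract's distinction between collapse on the train set and collapse on the test distribution can be made concrete with a within-class variability measure. The sketch below is illustrative, not the paper's exact measurement: it computes the ratio of within-class to between-class variance of last-layer features (a simplified proxy for the NC1 statistic of Papyan et al.), assuming `features` is an (n, d) array of activations and `labels` an (n,) array of class indices.

```python
import numpy as np

def within_class_collapse(features, labels):
    """Ratio of within-class to between-class feature variance.

    Values near 0 indicate variability collapse (NC1): each class's
    last-layer features concentrate tightly around the class mean.
    Evaluated on train features vs. test features, this separates the
    optimization-side and generalization-side versions of the conjecture.
    """
    global_mean = features.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        cls = features[labels == c]
        mu = cls.mean(axis=0)
        within += ((cls - mu) ** 2).sum()           # scatter around class mean
        between += len(cls) * ((mu - global_mean) ** 2).sum()  # class-mean spread
    return within / between  # assumes >= 2 distinct classes (between > 0)
```

Computing this separately on train-set and held-out features lets one check the paper's claim: the ratio trends toward zero on the train set late in training while remaining bounded away from zero on the test set.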


Related research

- On the geometry of generalization and memorization in deep neural networks (05/30/2021): Understanding how large neural networks avoid memorizing training data i...
- Layer-Peeled Model: Toward Understanding Well-Trained Deep Neural Networks (01/29/2021): In this paper, we introduce the Layer-Peeled Model, a nonconvex yet anal...
- Data-Dependence of Plateau Phenomenon in Learning with Neural Network — Statistical Mechanical Analysis (01/10/2020): The plateau phenomenon, wherein the loss value stops decreasing during t...
- Function space analysis of deep learning representation layers (10/09/2017): In this paper we propose a function space approach to Representation Lea...
- Experiments with Rich Regime Training for Deep Learning (02/26/2021): In spite of advances in understanding lazy training, recent work attribu...
- A Model of One-Shot Generalization (05/29/2022): We provide a theoretical framework to study a phenomenon that we call on...
- Towards Understanding Grokking: An Effective Theory of Representation Learning (05/20/2022): We aim to understand grokking, a phenomenon where models generalize long...
