Layer-Peeled Model: Toward Understanding Well-Trained Deep Neural Networks

01/29/2021
by Cong Fang, et al.

In this paper, we introduce the Layer-Peeled Model, a nonconvex yet analytically tractable optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this new model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which in part explains the recently discovered phenomenon of neural collapse in deep learning training [PHD20]. Moreover, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon was first predicted by the Layer-Peeled Model and only later confirmed by our computational experiments.
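To make the balanced-case result concrete, below is a minimal numerical sketch (not the authors' code) of the Layer-Peeled Model: the last-layer classifier W and one feature vector per class H are optimized directly under caps on their average squared norms, which is one way to instantiate the "certain constraints" mentioned above. The constants E_W and E_H, the step size, and the iteration count are illustrative assumptions. After projected gradient descent on the cross-entropy loss, the solution should approach the simplex equiangular tight frame signature: equal norms, pairwise cosines near -1/(K-1), and alignment between each classifier row and its class feature.

```python
import numpy as np

K, d = 5, 16            # number of classes, feature dimension (needs d >= K - 1)
E_W, E_H = 1.0, 1.0     # caps on the average squared row norms (illustrative)
lr, steps = 0.5, 5000   # plain projected-gradient settings (illustrative)

rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(K, d))   # peeled-off classifier: one weight vector per class
H = 0.1 * rng.normal(size=(K, d))   # last-layer features: one (class-mean) vector per class


def project(M, E):
    """Rescale rows of M so the average squared row norm is at most E."""
    ms = np.mean(np.sum(M ** 2, axis=1))
    return M if ms <= E else M * np.sqrt(E / ms)


for _ in range(steps):
    logits = H @ W.T                        # logits[k, j] = <h_k, w_j>
    logits -= logits.max(axis=1, keepdims=True)
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)       # softmax probabilities
    G = (P - np.eye(K)) / K                 # gradient of mean cross-entropy w.r.t. logits
    W_new = project(W - lr * (G.T @ H), E_W)
    H = project(H - lr * (G @ W), E_H)
    W = W_new

# Neural-collapse checks: pairwise cosines near -1/(K-1) and classifier/feature alignment.
Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
off_diag = (Wn @ Wn.T)[~np.eye(K, dtype=bool)]
print("target pairwise cosine -1/(K-1) =", round(-1 / (K - 1), 3))
print("observed pairwise cosines in [%.3f, %.3f]" % (off_diag.min(), off_diag.max()))
print("cos(w_k, h_k):", np.round(np.diag(Wn @ Hn.T), 3))
```

Reweighting the per-class terms of the loss in this same sketch, to mimic unequal class sizes, is one way to probe the Minority Collapse effect described above.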


Related research

- Quantifying the Variability Collapse of Neural Networks (06/06/2023): Recent studies empirically demonstrate the positive relationship between...
- Neural Collapse in Deep Linear Network: From Balanced to Imbalanced Data (01/01/2023): Modern deep neural networks have achieved superhuman performance in task...
- Limitations of Neural Collapse for Understanding Generalization in Deep Learning (02/17/2022): The recent work of Papyan, Han, Donoho (2020) presented an intriguin...
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? (03/17/2022): Modern deep neural networks for classification usually jointly learn a b...
- Plateau in Monotonic Linear Interpolation - A "Biased" View of Loss Landscape for Deep Networks (10/03/2022): Monotonic linear interpolation (MLI) - on the line connecting a random i...
- Understanding plasticity in neural networks (03/02/2023): Plasticity, the ability of a neural network to quickly change its predic...
- Class Interference of Deep Neural Networks (10/31/2022): Recognizing and telling similar objects apart is even hard for human bei...
