Prevalence of Neural Collapse during the terminal phase of deep learning training

08/18/2020
by   Vardan Papyan, et al.

Modern practice for training classification deepnets involves a Terminal Phase of Training (TPT), which begins at the epoch where training error first vanishes. During TPT, the training error stays effectively zero while training loss is pushed towards zero. Direct measurements of TPT, for three prototypical deepnet architectures and across seven canonical classification datasets, expose a pervasive inductive bias we call Neural Collapse, involving four deeply interconnected phenomena: (NC1) Cross-example within-class variability of last-layer training activations collapses to zero, as the individual activations themselves collapse to their class-means; (NC2) The class-means collapse to the vertices of a Simplex Equiangular Tight Frame (ETF); (NC3) Up to rescaling, the last-layer classifiers collapse to the class-means, or in other words to the Simplex ETF, i.e. to a self-dual configuration; (NC4) For a given activation, the classifier's decision collapses to simply choosing whichever class has the closest train class-mean, i.e. the Nearest Class Center (NCC) decision rule. The symmetric and very simple geometry induced by the TPT confers important benefits, including better generalization performance, better robustness, and better interpretability.
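The geometry in (NC2) and the decision rule in (NC4) are concrete enough to sketch in code. The snippet below (a minimal illustration, not code from the paper) constructs the Simplex ETF for C classes via the standard centering construction, verifies its defining properties (unit-norm vertices with pairwise inner products of -1/(C-1)), and applies the NCC rule to a sample activation. The function names `simplex_etf` and `ncc_predict` are hypothetical, chosen for this example.

```python
import numpy as np

def simplex_etf(C):
    """Return a C x C matrix whose columns are the Simplex ETF vertices:
    unit vectors whose pairwise inner products are all -1/(C-1)."""
    I = np.eye(C)
    ones = np.full((C, C), 1.0 / C)
    # Center the standard basis and rescale to unit norm.
    return np.sqrt(C / (C - 1)) * (I - ones)

def ncc_predict(h, class_means):
    """NCC decision rule (NC4): pick the class whose mean is closest to h."""
    return int(np.argmin(np.linalg.norm(class_means - h, axis=1)))

C = 4
M = simplex_etf(C)
G = M.T @ M  # Gram matrix of the vertices

# (NC2) checks: unit-norm vertices, equiangular at -1/(C-1).
assert np.allclose(np.diag(G), 1.0)
off_diagonal = G[~np.eye(C, dtype=bool)]
assert np.allclose(off_diagonal, -1.0 / (C - 1))

# (NC4) check: an activation near a class-mean is assigned to that class.
h = M[:, 2] + 0.01 * np.ones(C)
assert ncc_predict(h, M.T) == 2
```

Under (NC3), the rows of the last-layer classifier align (up to rescaling) with these same vertices, so the usual argmax-of-logits decision and the NCC rule above coincide.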

Related research:

- A Geometric Analysis of Neural Collapse with Unconstrained Features (05/06/2021)
- Nearest Class-Center Simplification through Intermediate Layers (01/21/2022)
- Neural Collapse: A Review on Modelling Principles and Generalization (06/08/2022)
- Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path (06/03/2021)
- Perturbation Analysis of Neural Collapse (10/29/2022)
- Towards understanding neural collapse in supervised contrastive learning with the information bottleneck method (05/19/2023)
