On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features

03/02/2022
by Jinxin Zhou et al.

When training deep neural networks for classification tasks, an intriguing empirical phenomenon has been widely observed in the last-layer classifiers and features, where (i) the class means and the last-layer classifiers all collapse to the vertices of a Simplex Equiangular Tight Frame (ETF) up to scaling, and (ii) cross-example within-class variability of last-layer activations collapses to zero. This phenomenon is called Neural Collapse (NC), and it seems to take place regardless of the choice of loss function. In this work, we justify NC under the mean squared error (MSE) loss, for which recent empirical evidence shows that it performs comparably to, or even better than, the de facto cross-entropy loss. Under a simplified unconstrained feature model, we provide the first global landscape analysis for the vanilla nonconvex MSE loss and show that the (only!) global minimizers are neural collapse solutions, while all other critical points are strict saddles whose Hessians exhibit negative curvature directions. Furthermore, we justify the usage of a rescaled MSE loss by probing the optimization landscape around the NC solutions, showing that the landscape can be improved by tuning the rescaling hyperparameters. Finally, our theoretical findings are experimentally verified on practical network architectures.
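To make the setting concrete, here is a minimal sketch of the unconstrained feature model and the simplex ETF geometry the abstract refers to; the regularization terms and normalization constants shown are standard choices in this line of work and are assumptions, so the paper's exact formulation may differ.

\[
\min_{W,\,H,\,b}\; f(W,H,b) \;=\; \frac{1}{2N}\,\big\| W H + b\,\mathbf{1}_N^\top - Y \big\|_F^2 \;+\; \frac{\lambda_W}{2}\,\|W\|_F^2 \;+\; \frac{\lambda_H}{2}\,\|H\|_F^2 \;+\; \frac{\lambda_b}{2}\,\|b\|_2^2,
\]

where W is the K x d last-layer classifier, H is the d x N matrix stacking the last-layer features of all N = Kn training samples from K classes, b is the bias, and Y is the K x N one-hot label matrix. Crucially, H is treated as a free optimization variable rather than the output of a network, which is what makes the landscape analysis tractable. A K-point simplex ETF is any matrix of the form

\[
M \;=\; \sqrt{\tfrac{K}{K-1}}\; P \left( I_K - \tfrac{1}{K}\,\mathbf{1}_K \mathbf{1}_K^\top \right), \qquad P \in \mathbb{R}^{d \times K},\quad P^\top P = I_K,
\]

whose columns have equal norms and pairwise inner products equal to -1/(K-1), the most negative value achievable by K equiangular unit vectors. An NC solution is then one in which every feature collapses to its class mean and the class means and classifier rows align, up to scaling, with the columns of a common simplex ETF.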

