Annihilation of Spurious Minima in Two-Layer ReLU Networks

10/12/2022
by Yossi Arjevani, et al.

We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss, where labels are generated by a target network. We exploit the rich symmetry structure of the problem to develop a novel set of tools for studying the mechanism by which over-parameterization annihilates spurious minima. Sharp analytic estimates are obtained for the loss and the Hessian spectrum at different minima, and it is proved that adding neurons can turn symmetric spurious minima into saddles; minima of lesser symmetry require more neurons. Using Cauchy's interlacing theorem, we prove the existence of descent directions in certain subspaces arising from the symmetry structure of the loss function. This analytic approach uses techniques, new to the field, from algebraic geometry, representation theory, and symmetry breaking, and rigorously confirms the effectiveness of over-parameterization in rendering the associated loss landscape accessible to gradient-based methods. For a fixed number of neurons and inputs, the spectral results remain true under symmetry-breaking perturbations of the target.
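To make the interlacing argument concrete, here is a minimal numerical sketch, not the paper's construction: it uses a generic random symmetric matrix as a stand-in for the Hessian at a critical point, and a principal submatrix as a stand-in for the restriction to a symmetry-adapted subspace. It verifies the Cauchy interlacing inequalities and shows how a negative eigenvalue of the restriction certifies a descent direction for the full Hessian.

```python
import numpy as np

# Sketch of the Cauchy interlacing argument (illustrative only):
# if the Hessian restricted to a subspace has a negative eigenvalue,
# the full Hessian does too, certifying a descent direction.

rng = np.random.default_rng(0)
n, m = 8, 3  # full dimension, subspace dimension (hypothetical sizes)

# Stand-in for a Hessian at a critical point (any symmetric matrix works).
A = rng.standard_normal((n, n))
H = (A + A.T) / 2

# Restrict H to the span of the first m coordinates, i.e. a principal
# submatrix; in the paper the subspace arises from the symmetry structure.
H_sub = H[:m, :m]

lam = np.linalg.eigvalsh(H)      # eigenvalues of H, ascending
mu = np.linalg.eigvalsh(H_sub)   # eigenvalues of the restriction, ascending

# Cauchy interlacing: lam[k] <= mu[k] <= lam[k + n - m] for k = 0..m-1.
for k in range(m):
    assert lam[k] <= mu[k] + 1e-12
    assert mu[k] <= lam[k + n - m] + 1e-12

# If the restriction is indefinite, so is H: mu[0] < 0 forces
# lam[0] <= mu[0] < 0, so the critical point is a saddle.
if mu[0] < 0:
    print(f"restricted min eigenvalue {mu[0]:.3f} -> "
          f"full min eigenvalue {lam[0]:.3f} < 0: descent direction exists")
```

The design point the sketch illustrates: one never needs the full Hessian spectrum to rule out a minimum; exhibiting a low-dimensional invariant subspace on which the restricted Hessian is indefinite suffices, which is what the symmetry structure of the loss makes tractable.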


