Symmetry critical points for a model shallow neural network

03/23/2020
by Yossi Arjevani, et al.
A detailed analysis is given of a family of critical points determining spurious minima for a model student-teacher 2-layer neural network, with ReLU activation function, and a natural Γ = S_k × S_k symmetry. For a k-neuron shallow network of this type, analytic equations are given which, for example, determine the critical points of the spurious minima described by Safran and Shamir (2018) for 6 < k < 20. These critical points have isotropy (conjugate to) the diagonal subgroup ΔS_{k-1} ⊂ ΔS_k of Γ. It is shown that critical points of this family can be expressed as an infinite series in 1/√k (for large enough k) and, as an application, that the critical values decay like a k^{-1}, where a ≈ 0.3. Other non-trivial families of critical points are also described, with isotropy conjugate to ΔS_{k-1}, ΔS_k and Δ(S_2 × S_{k-2}) (the latter giving spurious minima for k > 9). The methods used depend on symmetry breaking, bifurcation, and algebraic geometry, notably Artin's implicit function theorem, and are applicable to other families of critical points that occur in this network.
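
The Γ = S_k × S_k symmetry mentioned above can be checked numerically on a small instance. The abstract does not write out the objective, so the sketch below rests on assumptions: inputs are standard Gaussian, the teacher weights are the k × k identity matrix, and the loss is half the expected squared difference between the student and teacher outputs. Under those assumptions the expectation has a well-known closed form for pairs of ReLU units. The function names (`relu_pair_expectation`, `loss`, `permute`) are hypothetical and chosen only for illustration.

```python
import math
import random


def relu_pair_expectation(w, v):
    """E[relu(w.x) * relu(v.x)] for x ~ N(0, I), using the closed form
    |w||v| (sin t + (pi - t) cos t) / (2 pi), where t is the angle between w and v."""
    nw = math.sqrt(sum(a * a for a in w))
    nv = math.sqrt(sum(a * a for a in v))
    if nw == 0.0 or nv == 0.0:
        return 0.0
    cos_t = sum(a * b for a, b in zip(w, v)) / (nw * nv)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding outside [-1, 1]
    theta = math.acos(cos_t)
    return nw * nv * (math.sin(theta) + (math.pi - theta) * cos_t) / (2.0 * math.pi)


def loss(W, V):
    """Assumed objective: 0.5 * E[(sum_i relu(w_i.x) - sum_j relu(v_j.x))^2]."""
    ww = sum(relu_pair_expectation(a, b) for a in W for b in W)
    wv = sum(relu_pair_expectation(a, b) for a in W for b in V)
    vv = sum(relu_pair_expectation(a, b) for a in V for b in V)
    return 0.5 * ww - wv + 0.5 * vv


def identity(k):
    """k x k identity matrix as a list of rows (the assumed teacher weights)."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def permute(W, row_perm, col_perm):
    """Action of a pair of permutations: reorder the rows and the columns of W."""
    return [[W[i][j] for j in col_perm] for i in row_perm]


if __name__ == "__main__":
    k = 5
    rng = random.Random(0)
    V = identity(k)
    W = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(k)]
    rows = list(range(k))
    cols = list(range(k))
    rng.shuffle(rows)
    rng.shuffle(cols)
    print("loss(W)              =", loss(W, V))
    print("loss(permuted W)     =", loss(permute(W, rows, cols), V))
    print("loss at teacher (W=V) =", loss(V, V))
```

With the identity teacher, permuting the rows of W relabels the student neurons and permuting the columns relabels the input coordinates, which only reorders the teacher's basis vectors. The two printed loss values for W and its permuted copy should therefore agree up to rounding, and the loss at W = V should be zero up to rounding.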

research
06/13/2023

Symmetry Critical Points for Symmetric Tensor Decomposition Problems

We consider the non-convex optimization problem associated with the deco...
research
07/06/2021

Equivariant bifurcation, quadratic equivariants, and symmetry breaking for the standard representation of S_n

Motivated by questions originating from the study of a class of shallow ...
research
10/12/2022

Annihilation of Spurious Minima in Two-Layer ReLU Networks

We study the optimization problem associated with fitting two-layer ReLU...
research
05/25/2021

Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

We study how permutation symmetries in overparameterized multi-layer neu...
research
04/08/2021

Numerics and analysis of Cahn–Hilliard critical points

We explore recent progress and open questions concerning local minima an...
research
08/04/2020

Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry

We consider the optimization problem associated with fitting two-layers ...
research
04/12/2023

Function Space and Critical Points of Linear Convolutional Networks

We study the geometry of linear networks with one-dimensional convolutio...
