On the High Symmetry of Neural Network Functions

11/12/2022
by Umberto Michelucci, et al.

Training neural networks means solving a high-dimensional optimization problem. The goal is normally to minimize a loss function that depends on the network function, that is, the function that maps a given input to the network output. This function depends on a large number of parameters, also known as weights, whose number is determined by the network architecture. In general, the goal of this optimization problem is to find the global minimum of this loss function. This paper discusses how, due to the way neural networks are designed, the network function presents a very large symmetry in the parameter space. This work shows that the network function has a number of equivalent minima, in other words minima that give the same value of the loss function and exactly the same output, and that this number grows factorially with the number of neurons in each layer of a feed-forward neural network, or with the number of filters in a convolutional neural network. When the number of neurons and layers is large, the number of equivalent minima grows extremely fast. This, of course, has consequences for the study of how neural networks converge to minima during training. This result is known, but in this paper a proper mathematical discussion is presented for the first time and an estimate of the number of equivalent minima is derived.
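To make the permutation symmetry concrete, the following is a minimal numerical sketch (not code from the paper; the network sizes and names are illustrative). For a feed-forward network with one hidden layer, permuting the hidden neurons, i.e. permuting the rows of the first weight matrix and bias together with the matching columns of the second weight matrix, leaves the network function unchanged, so a hidden layer with n neurons alone yields n! equivalent parameter configurations.

import numpy as np
from math import factorial

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def network(x, W1, b1, W2, b2):
    # One-hidden-layer feed-forward network: output = W2 @ relu(W1 @ x + b1) + b2
    return W2 @ relu(W1 @ x + b1) + b2

n_in, n_hidden, n_out = 3, 4, 2
W1 = rng.normal(size=(n_hidden, n_in))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))
b2 = rng.normal(size=n_out)

# Permute the hidden neurons: rows of W1 and entries of b1 (incoming weights),
# together with the matching columns of W2 (outgoing weights).
perm = rng.permutation(n_hidden)
W1_p, b1_p = W1[perm], b1[perm]
W2_p = W2[:, perm]

x = rng.normal(size=n_in)
print(np.allclose(network(x, W1, b1, W2, b2),
                  network(x, W1_p, b1_p, W2_p, b2)))   # True: identical network function
print(factorial(n_hidden))   # 4! = 24 equivalent parameter settings from this layer alone

Across several hidden layers the counts multiply, so a network with hidden layer widths n_1, ..., n_L has on the order of n_1! * n_2! * ... * n_L! parameter configurations that realize the same function, which is the factorial growth of equivalent minima discussed in the abstract.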


