Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape

05/31/2023
by Kedar Karhadkar, et al.

We study the loss landscape of two-layer, mildly overparameterized ReLU neural networks trained with the squared error loss on a generic finite input dataset. Our approach bounds the dimension of the sets of local and global minima using the rank of the Jacobian of the parameterization map. Using results on random binary matrices, we show that most activation patterns correspond to parameter regions with no bad differentiable local minima. Furthermore, for one-dimensional input data, we show that most activation regions realizable by the network contain a high-dimensional set of global minima and no bad local minima. We confirm these results experimentally, finding a phase transition, depending on the amount of overparameterization, from most regions having a full-rank Jacobian to many regions having a rank-deficient one.
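
To make the Jacobian-rank criterion concrete, the following is a minimal NumPy sketch (not the authors' code) of the kind of experiment the abstract describes: it samples random parameters of a two-layer ReLU network f(x) = sum_j v_j relu(w_j . x) on a fixed generic dataset, assembles the Jacobian of the parameterization map theta -> (f(x_1), ..., f(x_n)), and estimates the fraction of sampled activation regions in which that Jacobian has full rank n. The dimensions, Gaussian sampling, and helper names (jacobian_rank, full_rank_fraction) are all illustrative assumptions.

```python
import numpy as np

def jacobian_rank(X, W, v):
    """Rank of the Jacobian of theta -> (f(x_1), ..., f(x_n)) at (W, v),
    where f(x) = sum_j v_j * relu(w_j . x)."""
    n, d = X.shape
    m = W.shape[0]
    # n x m activation pattern: A_ij = 1 if unit j is active on input x_i.
    A = (X @ W.T > 0).astype(float)
    # d f(x_i) / d w_j = A_ij * v_j * x_i  (a d-vector per unit)
    J_W = (A * v[None, :])[:, :, None] * X[:, None, :]   # n x m x d
    # d f(x_i) / d v_j = A_ij * (w_j . x_i)
    J_v = A * (X @ W.T)                                   # n x m
    J = np.concatenate([J_W.reshape(n, m * d), J_v], axis=1)
    return np.linalg.matrix_rank(J)

def full_rank_fraction(n=20, d=5, m=10, trials=200, seed=0):
    """Fraction of randomly sampled parameter points whose activation
    region has a full-rank (rank n) Jacobian; full rank rules out bad
    differentiable local minima in that region, per the paper's argument."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))   # a generic finite input dataset
    hits = 0
    for _ in range(trials):
        W = rng.standard_normal((m, d))
        v = rng.standard_normal(m)
        if jacobian_rank(X, W, v) == n:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    # Sweep the hidden width m to look for the phase transition between
    # most regions being full rank and many being rank deficient.
    for m in (2, 4, 8, 16, 32):
        print(f"m = {m:2d}: full-rank fraction = {full_rank_fraction(m=m):.2f}")
```

Note that full rank n is only possible once m(d + 1) >= n, so sweeping the width m at fixed n and d is one natural way to vary the amount of overparameterization in this sketch.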


