Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 2: Experiments and Analysis

06/15/2020
by   Bo Liu, et al.

The existence of local minima for one-hidden-layer ReLU networks has been investigated theoretically in [8]. Building on that theory, we first analyze how large the probability that local minima exist is for 1D Gaussian data, and how this probability varies across the whole weight space. We show that it is very low in most regions. We then design and implement a linear-programming-based approach for judging whether genuine local minima exist, and use it to predict whether bad local minima exist for the MNIST and CIFAR-10 datasets. We find that there are no bad differentiable local minima almost everywhere in weight space once some hidden neurons are activated by samples. These theoretical predictions are verified experimentally by showing that gradient descent is not trapped in the cells from which it starts. We also perform experiments to explore the count and size of differentiable cells in the weight space.
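To make the "not trapped in the starting cell" check concrete, here is a minimal sketch, not the authors' code. It assumes a cell is the region of weight space in which every hidden neuron keeps the same on/off state on every training sample, and it uses a bias-free network f(x) = sum_j v_j * relu(w_j . x) with mean squared loss. The names activation_pattern, gd_step, and steps_until_cell_exit are hypothetical, and the abstract does not specify the actual experimental setup.

```python
def relu(z):
    return z if z > 0 else 0.0

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def activation_pattern(W, X):
    """pattern[j][i] is 1 if hidden neuron j is active on sample i.
    Assumption: a constant pattern identifies one differentiable cell."""
    return tuple(tuple(1 if dot(w, x) > 0 else 0 for x in X) for w in W)

def gd_step(W, v, X, y, lr):
    """One full-batch gradient step on L = 1/(2n) * sum_i (f(x_i) - y_i)^2."""
    n = len(X)
    pre = [[dot(w, x) for x in X] for w in W]  # pre[j][i] = w_j . x_i
    resid = [
        sum(v[j] * relu(pre[j][i]) for j in range(len(W))) - y[i]
        for i in range(n)
    ]
    # Gradient w.r.t. output weights v_j.
    new_v = [
        v[j] - lr * sum(resid[i] * relu(pre[j][i]) for i in range(n)) / n
        for j in range(len(W))
    ]
    # Gradient w.r.t. hidden weights w_j (uses the old v, so the update is simultaneous).
    new_W = []
    for j, w in enumerate(W):
        grad = [
            sum(
                resid[i] * v[j] * (1.0 if pre[j][i] > 0 else 0.0) * X[i][k]
                for i in range(n)
            ) / n
            for k in range(len(w))
        ]
        new_W.append([w_k - lr * g_k for w_k, g_k in zip(w, grad)])
    return new_W, new_v

def steps_until_cell_exit(W, v, X, y, lr=0.1, max_steps=1000):
    """Return the first step at which the activation pattern differs from
    the starting one, or None if it never changes within max_steps."""
    start = activation_pattern(W, X)
    for step in range(1, max_steps + 1):
        W, v = gd_step(W, v, X, y, lr)
        if activation_pattern(W, X) != start:
            return step
    return None
```

Under these assumptions, a return value of None within the step budget means the run stayed inside its starting cell, while an integer is the step at which it first crossed a cell boundary.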


research
02/12/2020

Understanding Global Loss Landscape of One-hidden-layer ReLU Neural Networks

For one-hidden-layer ReLU networks, we show that all local minima are gl...
research
02/10/2022

Exact Solutions of a Deep Linear Network

This work finds the exact solutions to a deep linear network with weight...
research
05/31/2023

Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape

We study the loss landscape of two-layer mildly overparameterized ReLU n...
research
12/29/2017

The Multilinear Structure of ReLU Networks

We study the loss surface of neural networks equipped with a hinge loss ...
research
09/28/2018

Efficiently testing local optimality and escaping saddles for ReLU networks

We provide a theoretical algorithm for checking local optimality and esc...
research
12/22/2010

Local Minima of a Quadratic Binary Functional with a Quasi-Hebbian Connection Matrix

The local minima of a quadratic functional depending on binary variables...
research
02/19/2017

Exponentially vanishing sub-optimal local minima in multilayer neural networks

Background: Statistical mechanics results (Dauphin et al. (2014); Chorom...
