Sub-Optimal Local Minima Exist for Almost All Over-parameterized Neural Networks

11/04/2019
by Tian Ding, et al.

Does over-parameterization eliminate sub-optimal local minima for neural network problems? On one hand, existing positive results do not prove this claim, but only weaker versions of it. On the other hand, existing negative results rely on strong assumptions about the activation functions and/or the data samples, leaving a large gap with the positive results. It was previously unclear whether the question admits a clean "yes" or "no" answer. In this paper, we answer it with a strong negative result: we prove that for deep and over-parameterized networks, sub-optimal local minima exist for generic input data samples and generic nonlinear activations. This is the setting widely studied in work on the global landscape of over-parameterized networks, so our result corrects a possible misconception that "over-parameterization eliminates sub-optimal local minima". Our construction is based on fundamental optimization analysis and is therefore rather principled.
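To make the claim concrete, here is a small numerical sketch. It is an illustration only, not the paper's construction: the paper covers generic nonlinear activations, whereas this sketch uses the classical "dead ReLU" trick, whose flat region makes the argument especially simple. It builds a one-hidden-layer ReLU network with more hidden units than training samples (so the global minimum of the loss is zero), places the parameters at a point where every hidden unit is strictly inactive on all inputs and the output bias is the best constant predictor, and then checks numerically that no small perturbation reduces the loss. All names and values in the snippet are illustrative assumptions.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    rng = np.random.default_rng(0)

    # Two 1-D samples; an over-parameterized net can fit them exactly,
    # so the global minimum of the loss is 0.
    x = np.array([1.0, 2.0])
    y = np.array([0.0, 1.0])

    def loss(W, b, v, c):
        """Mean squared error of a one-hidden-layer ReLU net on (x, y)."""
        pred = relu(np.outer(x, W) + b) @ v + c
        return np.mean((pred - y) ** 2)

    # Candidate bad point: every hidden unit is strictly inactive on all
    # inputs ("dead" ReLUs) and the output bias is the best constant fit.
    h = 10                      # 10 hidden units for 2 samples: over-parameterized
    W = -np.ones(h)             # W * x + b <= -2 < 0 on both samples
    b = -np.ones(h)
    v = rng.standard_normal(h)  # arbitrary: dead units contribute nothing
    c = y.mean()                # optimal constant output

    base = loss(W, b, v, c)     # = 0.25 here, while the global minimum is 0

    # Probe a small neighborhood: no nearby point should beat `base`.
    eps = 1e-3
    best_nearby = min(
        loss(W + eps * rng.standard_normal(h),
             b + eps * rng.standard_normal(h),
             v + eps * rng.standard_normal(h),
             c + eps * rng.standard_normal())
        for _ in range(10_000)
    )
    print(f"loss at candidate point:    {base:.4f}")
    print(f"best loss among 10k nearby: {best_nearby:.4f}")  # >= base

Because the pre-activations are strictly negative, small perturbations leave every unit inactive, so the loss near the candidate point depends only on the output bias, which is already optimal; that is what makes the point a genuine sub-optimal local minimum despite the over-parameterization.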


