Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity

by Shiyu Liang et al.

Traditional landscape analysis of deep neural networks aims to show that no sub-optimal local minima exist in some appropriate sense. From this, one may be tempted to conclude that descent algorithms which escape saddle points will reach a good local minimum. However, basic optimization theory tells us that a descent algorithm can also diverge to infinity if there are paths to infinity along which the loss function decreases. It has been unclear whether, for non-linear neural networks, there exists a setting in which the absence of bad local minima and the absence of decreasing paths to infinity can be achieved simultaneously. In this paper, we give the first positive answer to this question. More specifically, for a large class of over-parameterized deep neural networks with appropriate regularizers, the loss function has no bad local minima and no decreasing paths to infinity. The key mathematical trick is to show that the set of regularizers which may be undesirable can be viewed as the image of a Lipschitz continuous mapping from a lower-dimensional Euclidean space to a higher-dimensional Euclidean space, and therefore has Lebesgue measure zero.
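The zero-measure claim rests on a standard fact about Lipschitz maps between Euclidean spaces of different dimensions. A sketch of the statement and argument follows (the notation is mine, not taken from the paper):

```latex
% Standard fact underlying the zero-measure argument:
% a Lipschitz image of a lower-dimensional Euclidean space is Lebesgue-null.
\begin{lemma}
Let $f:\mathbb{R}^m \to \mathbb{R}^n$ be $L$-Lipschitz with $m < n$.
Then $f(\mathbb{R}^m)$ has $n$-dimensional Lebesgue measure zero.
\end{lemma}
\begin{proof}[Proof sketch]
Cover $[0,1]^m$ by $k^m$ cubes of side $1/k$; each has diameter
$\sqrt{m}/k$, so its image lies in a ball of radius $L\sqrt{m}/k$,
whose $n$-dimensional volume is $C_n (L\sqrt{m}/k)^n$. Summing over
the cubes,
\[
  \mu_n\!\bigl(f([0,1]^m)\bigr)
  \;\le\; k^m \, C_n (L\sqrt{m})^n k^{-n}
  \;=\; C_n (L\sqrt{m})^n \, k^{\,m-n}
  \;\xrightarrow{k\to\infty}\; 0,
\]
since $m < n$. A countable union of unit cubes covers all of
$\mathbb{R}^m$, so the full image is also null.
\end{proof}
```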

