Critical Points of Neural Networks: Analytical Forms and Landscape Properties

10/30/2017
by Yi Zhou, et al.

Due to the success of deep learning in solving a variety of challenging machine learning tasks, there is rising interest in understanding the loss functions used to train neural networks from a theoretical perspective. In particular, the properties of critical points and of the landscape around them determine the convergence performance of optimization algorithms. In this paper, we provide a full (necessary and sufficient) characterization of the analytical forms of the critical points (as well as the global minimizers) of the square loss functions for various neural networks. We show that the analytical forms of the critical points characterize the values of the corresponding loss functions as well as the necessary and sufficient conditions for achieving the global minimum. Furthermore, we exploit the analytical forms of the critical points to characterize the landscape properties of the loss functions of these neural networks. One particular conclusion is that the loss function of linear networks has no spurious local minima, while the loss function of one-hidden-layer nonlinear networks with the ReLU activation function does have local minima that are not global minima.
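To make the "no spurious local minima for linear networks" claim concrete, here is a minimal numerical sketch (not taken from the paper) for the smallest possible case: a two-layer scalar linear network f(x) = w2·w1·x fit to the target y = x under the square loss. The point (w1, w2) = (0, 0) is a critical point, but it is a saddle rather than a spurious local minimum, since a small symmetric perturbation strictly decreases the loss. The function names and the specific loss instance are illustrative choices, not the paper's notation.

```python
import numpy as np

# Square loss of a two-layer scalar linear network f(x) = w2 * w1 * x
# fit to the target y = x, i.e. L(w1, w2) = (w2 * w1 - 1)^2.
def loss(w1, w2):
    return (w2 * w1 - 1.0) ** 2

def grad(w1, w2):
    # Gradient (dL/dw1, dL/dw2) computed by the chain rule.
    r = w2 * w1 - 1.0
    return np.array([2.0 * r * w2, 2.0 * r * w1])

# (0, 0) is a critical point: the gradient vanishes there.
g = grad(0.0, 0.0)

# But it is not a spurious local minimum: moving to (eps, eps)
# strictly decreases the loss for any small eps > 0, so (0, 0)
# is a saddle point of the landscape.
eps = 0.1
drop = loss(0.0, 0.0) - loss(eps, eps)  # positive for small eps
```

This matches the abstract's dichotomy: every critical point of the linear network's square loss is either a global minimum or a saddle, whereas the one-hidden-layer ReLU case admits genuine non-global local minima.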


