The Global Optimization Geometry of Shallow Linear Neural Networks

05/13/2018
by Zhihui Zhu, et al.

We examine the squared error loss landscape of shallow linear neural networks. By using a regularizer on the training samples, we show, under significantly milder assumptions than previous works, that the corresponding optimization problems have benign geometric properties: there are no spurious local minima, and the Hessian at every saddle point has at least one negative eigenvalue. Consequently, every saddle point has a direction of negative curvature that algorithms can exploit to further decrease the objective value. These geometric properties imply that many local search algorithms, including gradient descent, which is widely used for training neural networks, can provably solve the training problem with global convergence. The additional regularizer has no effect on the global minimum value; rather, it plays a useful role in shrinking the set of critical points. Experiments show that, in certain cases, this regularizer also speeds up the convergence of iterative algorithms for solving the training optimization problem.
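To make the saddle-point claim concrete, here is a minimal sketch on a toy problem that is not taken from the paper: a scalar shallow linear network with prediction w2 * w1 * x and the plain squared error loss. The data, function names, step size, and starting point are all illustrative assumptions, and the paper's regularizer is deliberately left out because the abstract does not state its form. At the origin the gradient vanishes, yet the Hessian is [[0, -c], [-c, 0]] with c = sum(x * y), so it has the negative eigenvalue -|c| and gradient descent started nearby can escape toward a global minimum.

```python
# Toy scalar shallow linear network: prediction = w2 * w1 * x.
# Hypothetical data with y = 2 * x exactly, so the global minimum value is 0.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def loss(w1, w2):
    """Squared error loss 0.5 * sum_i (w2 * w1 * x_i - y_i)^2 (no regularizer)."""
    return 0.5 * sum((w2 * w1 * x - y) ** 2 for x, y in zip(xs, ys))


def grad(w1, w2):
    """Gradient of the loss with respect to (w1, w2)."""
    g1 = sum((w2 * w1 * x - y) * w2 * x for x, y in zip(xs, ys))
    g2 = sum((w2 * w1 * x - y) * w1 * x for x, y in zip(xs, ys))
    return g1, g2


def hessian_eigs_at_origin():
    """Eigenvalues of the Hessian at (0, 0).

    There the Hessian equals [[0, -c], [-c, 0]] with c = sum(x * y),
    whose eigenvalues are -|c| and +|c|.
    """
    c = sum(x * y for x, y in zip(xs, ys))
    return -abs(c), abs(c)


def gradient_descent(w1, w2, step=0.01, iters=2000):
    """Plain gradient descent with a fixed step size (illustrative values)."""
    for _ in range(iters):
        g1, g2 = grad(w1, w2)
        w1, w2 = w1 - step * g1, w2 - step * g2
    return w1, w2


if __name__ == "__main__":
    lam_min, lam_max = hessian_eigs_at_origin()
    w1, w2 = gradient_descent(0.1, 0.1)  # start near, not at, the saddle
    print("gradient at origin:", grad(0.0, 0.0))
    print("Hessian eigenvalues at origin:", lam_min, lam_max)
    print("final product w2 * w1:", w1 * w2, "final loss:", loss(w1, w2))
```

Starting exactly at the origin would leave gradient descent stuck, since the gradient there is zero; the small offset (0.1, 0.1) has a component along the negative-curvature direction, which is what lets the iterates move away from the saddle.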


research
04/20/2023

Interpolation property of shallow neural networks

We study the geometry of global minima of the loss landscape of overpara...
research
11/08/2018

A Geometric Approach of Gradient Descent Algorithms in Neural Networks

In this article we present a geometric framework to analyze convergence ...
research
10/30/2017

Critical Points of Neural Networks: Analytical Forms and Landscape Properties

Due to the success of deep learning to solving a variety of challenging ...
research
10/21/2019

On Distributed Stochastic Gradient Algorithms for Global Optimization

The paper considers the problem of network-based computation of global m...
research
12/05/2019

Analysis of the Optimization Landscapes for Overcomplete Representation Learning

We study nonconvex optimization landscapes for learning overcomplete rep...
research
10/09/2019

Nearly Minimal Over-Parametrization of Shallow Neural Networks

A recent line of work has shown that an overparametrized neural network ...
research
10/22/2017

Iteratively reweighted ℓ_1 algorithms with extrapolation

Iteratively reweighted ℓ_1 algorithm is a popular algorithm for solving ...
