The loss surface of deep linear networks viewed through the algebraic geometry lens

10/17/2018
by   Dhagash Mehta, et al.

Using the viewpoint of modern computational algebraic geometry, we explore properties of the optimization landscapes of deep linear neural network models. After clarifying the various definitions of "flat" minima, we show that the geometrically flat minima, which are merely artifacts of residual continuous symmetries of deep linear networks, can be straightforwardly removed by a generalized L_2 regularization. We then establish upper bounds on the number of isolated stationary points of these networks with the help of algebraic geometry, and, using these bounds together with a numerical algebraic geometry method, we find all stationary points for networks of modest depth and matrix size. We show that in the presence of non-zero regularization, deep linear networks indeed possess local minima that are not global minima. Our computational results clarify certain aspects of the loss surfaces of deep linear networks and provide novel insights.
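The continuous-symmetry point can be illustrated with a minimal sketch (all dimensions, data, and the regularization strength are hypothetical choices, not taken from the paper): in a deep linear network, rescaling one weight matrix by a constant and a neighboring one by its inverse leaves the end-to-end product, and hence the unregularized squared-error loss, unchanged, creating a flat direction; an L_2 penalty on the individual factors is not invariant under this rescaling, so it breaks the symmetry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a depth-3 deep linear network with 2x2 weight matrices.
d, depth = 2, 3
X = rng.standard_normal((d, 50))  # inputs
Y = rng.standard_normal((d, 50))  # targets
Ws = [rng.standard_normal((d, d)) for _ in range(depth)]

def end_to_end(Ws):
    """End-to-end linear map W_H @ ... @ W_1."""
    P = np.eye(d)
    for W in Ws:
        P = W @ P
    return P

def loss(Ws, lam=0.0):
    """Squared error plus an L_2 penalty on each weight matrix."""
    residual = end_to_end(Ws) @ X - Y
    reg = sum(np.sum(W**2) for W in Ws)
    return 0.5 * np.sum(residual**2) + 0.5 * lam * reg

# Rescale two adjacent factors by c and 1/c: the product is unchanged.
c = 3.0
Ws_scaled = [Ws[0] * c, Ws[1] / c] + Ws[2:]

print(np.isclose(loss(Ws), loss(Ws_scaled)))            # unregularized: flat direction
print(np.isclose(loss(Ws, 0.1), loss(Ws_scaled, 0.1)))  # regularized: symmetry broken
```

Generically the penalty term `c**2 * ||W_1||^2 + ||W_2||^2 / c**2` differs from `||W_1||^2 + ||W_2||^2` for `c != 1`, which is why the regularized loss distinguishes points along the formerly flat direction.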


