On the Quality of the Initial Basin in Overspecified Neural Networks

11/13/2015
by   Itay Safran, et al.
0

Deep learning, in the form of artificial neural networks, has achieved remarkable practical success in recent years, for a variety of difficult machine learning applications. However, a theoretical explanation for this remains a major open problem, since training neural networks involves optimizing a highly non-convex objective function, and is known to be computationally hard in the worst case. In this work, we study the geometric structure of the associated non-convex objective function, in the context of ReLU networks and starting from a random initialization of the network parameters. We identify some conditions under which it becomes more favorable to optimization, in the sense of (i) High probability of initializing at a point from which there is a monotonically decreasing path to a global minimum; and (ii) High probability of initializing at a basin (suitably defined) with a small minimal objective value. A common theme in our results is that such properties are more likely to hold for larger ("overspecified") networks, which accords with some recent empirical and theoretical observations.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/17/2016

Mollifying Networks

The optimization of deep neural networks can be more challenging than tr...
research
05/31/2023

Optimal Sets and Solution Paths of ReLU Networks

We develop an analytical framework to characterize the set of optimal Re...
research
02/18/2016

Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

We develop a general duality between neural networks and compositional k...
research
12/11/2018

On the Ineffectiveness of Variance Reduced Optimization for Deep Learning

The application of stochastic variance reduction to optimization has sho...
research
07/17/2016

Piecewise convexity of artificial neural networks

Although artificial neural networks have shown great promise in applicat...
research
07/31/2021

The Separation Capacity of Random Neural Networks

Neural networks with random weights appear in a variety of machine learn...
research
06/20/2023

No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths

Understanding the optimization dynamics of neural networks is necessary ...

Please sign up or login with your details

Forgot password? Click here to reset