Qualitatively characterizing neural network optimization problems

12/19/2014
by Ian J. Goodfellow et al.

Training neural networks involves solving large-scale non-convex optimization problems. This task has long been believed to be extremely difficult, with fear of local minima and other obstacles motivating a variety of schemes to improve optimization, such as unsupervised pretraining. However, modern neural networks are able to achieve negligible training error on complex tasks, using only direct training with stochastic gradient descent. We introduce a simple analysis technique to look for evidence that such networks are overcoming local optima. We find that, in fact, on a straight path from initialization to solution, a variety of state-of-the-art neural networks never encounter any significant obstacles.
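The analysis technique the abstract describes is simple to reproduce: take the parameters at initialization, theta_0, and at the end of training, theta_1, form theta(alpha) = (1 - alpha) * theta_0 + alpha * theta_1, and evaluate the loss J(theta(alpha)) as alpha sweeps from 0 to 1. Below is a minimal sketch of that experiment, not the authors' code; the toy model, synthetic data, and hyperparameters are illustrative assumptions, and only the alpha-sweep itself follows the paper.

    # Minimal sketch of the linear-interpolation experiment (not the authors' code).
    # Model, data, and hyperparameters are assumptions made for illustration.
    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X = torch.randn(512, 20)                      # synthetic inputs (assumption)
    y = (X.sum(dim=1, keepdim=True) > 0).float()  # synthetic labels (assumption)

    model = nn.Sequential(nn.Linear(20, 64), nn.Tanh(), nn.Linear(64, 1))
    loss_fn = nn.BCEWithLogitsLoss()
    theta_0 = copy.deepcopy(model.state_dict())   # parameters at initialization

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(200):                          # train directly with SGD
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    theta_1 = copy.deepcopy(model.state_dict())   # parameters at the solution

    # Evaluate J(theta(alpha)) along the straight path from theta_0 to theta_1.
    for i in range(21):
        alpha = i / 20
        interp = {k: (1 - alpha) * theta_0[k] + alpha * theta_1[k]
                  for k in theta_0}
        model.load_state_dict(interp)
        with torch.no_grad():
            print(f"alpha={alpha:.2f}  J={loss_fn(model(X), y).item():.4f}")

If the path is obstacle-free, as the paper reports for a variety of state-of-the-art networks, the printed loss values decrease roughly monotonically in alpha, with no intermediate barrier between initialization and solution.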


Related research

research · 11/03/2016
Finding Approximate Local Minima Faster than Gradient Descent
We design a non-convex second-order optimization algorithm that is guara...

research · 02/03/2020
A Relaxation Argument for Optimization in Neural Networks and Non-Convex Compressed Sensing
It has been observed in practical applications and in theoretical analys...

research · 12/12/2020
Revisiting "Qualitatively Characterizing Neural Network Optimization Problems"
We revisit and extend the experiments of Goodfellow et al. (2014), who s...

research · 01/28/2020
MSE-Optimal Neural Network Initialization via Layer Fusion
Deep neural networks achieve state-of-the-art performance for a range of...

research · 06/20/2023
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
Understanding the optimization dynamics of neural networks is necessary ...

research · 05/21/2018
Never look back - The EnKF method and its application to the training of neural networks without back propagation
In this work, we present a new derivative-free optimization method and i...

research · 05/23/2019
Parsimonious Deep Learning: A Differential Inclusion Approach with Global Convergence
Over-parameterization is ubiquitous nowadays in training neural networks...
