Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods

10/28/2016
by   Antoine Gautier, et al.
0

The optimization problem behind neural networks is highly non-convex. Training with stochastic gradient descent and variants requires careful parameter tuning and provides no guarantee to achieve the global optimum. In contrast we show under quite weak assumptions on the data that a particular class of feedforward neural networks can be trained globally optimal with a linear convergence rate with our nonlinear spectral method. Up to our knowledge this is the first practically feasible method which achieves such a guarantee. While the method can in principle be applied to deep networks, we restrict ourselves for simplicity in this paper to one and two hidden layer networks. Our experiments confirm that these models are rich enough to achieve good performance on a series of real-world datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/29/2018

On the Convergence Rate of Training Recurrent Neural Networks

Despite the huge success of deep learning, our understanding to how the ...
research
02/26/2017

Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Deep learning models are often successfully trained using gradient desce...
research
04/26/2017

The loss surface of deep and wide neural networks

While the optimization problem behind deep neural networks is highly non...
research
07/19/2022

Approximation Power of Deep Neural Networks: an explanatory mathematical survey

The goal of this survey is to present an explanatory review of the appro...
research
11/20/2017

Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

By lifting the ReLU function into a higher dimensional space, we develop...
research
11/21/2022

Neural networks trained with SGD learn distributions of increasing complexity

The ability of deep neural networks to generalise well even when they in...
research
03/26/2018

A Provably Correct Algorithm for Deep Learning that Actually Works

We describe a layer-by-layer algorithm for training deep convolutional n...

Please sign up or login with your details

Forgot password? Click here to reset