A Provably Correct Algorithm for Deep Learning that Actually Works

03/26/2018
by Eran Malach et al.

We describe a layer-by-layer algorithm for training deep convolutional networks, where each step involves gradient updates for a two-layer network followed by a simple clustering algorithm. Our algorithm stems from a deep generative model that generates images level by level, where lower-resolution images correspond to latent semantic classes. We analyze the convergence rate of our algorithm assuming that the data is indeed generated according to this model (along with additional assumptions). While we do not claim that these assumptions are realistic for natural images, we do believe that they capture some true properties of real data. Furthermore, we show that our algorithm actually works in practice (on the CIFAR dataset), achieving results in the same ballpark as vanilla convolutional neural networks trained by stochastic gradient descent. Finally, our proof techniques may be of independent interest.
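
To make the training scheme concrete, here is a minimal PyTorch sketch of one plausible reading of the layer-by-layer procedure described above. It is an illustration under assumptions, not the authors' implementation: the helper names (train_layer, layerwise_train), the architecture details (3x3 strided convolutions, global average pooling), and the use of k-means for the clustering step are all hypothetical choices filled in for the example.

```python
# Hypothetical sketch of the layer-by-layer scheme: each round trains a
# two-layer network (conv + linear head) with gradient updates, then runs
# a simple clustering step on the frozen features to produce pseudo-labels
# for the next round. Names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.cluster import KMeans


def train_layer(x, labels, in_ch, hidden_ch, n_classes, steps=500, lr=0.1):
    """Gradient updates for one two-layer network (hypothetical helper)."""
    conv = nn.Conv2d(in_ch, hidden_ch, kernel_size=3, stride=2, padding=1)
    head = nn.Linear(hidden_ch, n_classes)
    opt = torch.optim.SGD(list(conv.parameters()) + list(head.parameters()), lr=lr)
    for _ in range(steps):
        h = F.relu(conv(x))                      # hidden representation
        logits = head(h.mean(dim=(2, 3)))        # global average pool -> class scores
        loss = F.cross_entropy(logits, labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return conv


def layerwise_train(x, labels, n_classes, channels=(3, 64, 128), k=32):
    """Greedy stacking: train a two-layer net, freeze its conv, cluster."""
    for in_ch, hidden_ch in zip(channels[:-1], channels[1:]):
        conv = train_layer(x, labels, in_ch, hidden_ch, n_classes)
        with torch.no_grad():
            x = F.relu(conv(x))                  # frozen, lower-resolution features
        # Simple clustering step: recover "latent semantic classes" as
        # k-means pseudo-labels that supervise the next round.
        flat = x.mean(dim=(2, 3)).cpu().numpy()
        labels = torch.from_numpy(
            KMeans(n_clusters=k, n_init=10).fit_predict(flat)
        ).long()
        n_classes = k
    return x, labels
```

The structural point the sketch tries to capture is that each round only ever backpropagates through a two-layer network; depth accumulates by freezing each trained convolution and handing the clustering-derived pseudo-labels to the next round, e.g. layerwise_train(images, class_labels, n_classes=10) on a CIFAR-sized batch.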

