SGD Learns the Conjugate Kernel Class of the Network

02/27/2017
by Amit Daniely, et al.
We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to learn, in polynomial time, a function that is competitive with the best function in the conjugate kernel space of the network, as defined in Daniely, Frostig and Singer. The result holds for log-depth networks from a rich family of architectures. To the best of our knowledge, it is the first polynomial-time guarantee for the standard neural network learning algorithm for networks of depth more than two. As corollaries, it follows that for neural networks of any depth between 2 and log(n), SGD is guaranteed to learn, in polynomial time, constant-degree polynomials with polynomially bounded coefficients. Likewise, it follows that SGD on large enough networks can learn any continuous function (not in polynomial time), complementing classical expressivity results.
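
As a rough illustration of the corollary about constant-degree polynomials, the sketch below trains a depth-2 ReLU network with plain minibatch SGD on a degree-2 polynomial of inputs drawn from the unit sphere. It is not the construction analyzed in the paper: the width, the 1/sqrt(width) output scaling, the Gaussian initialization, the target polynomial, and all hyperparameters (and the names sample_sphere, target, forward, mse, train_sgd) are assumptions chosen only to keep the example small.

```python
import numpy as np


def sample_sphere(rng, m, n):
    """Draw m points uniformly from the unit sphere in R^n."""
    x = rng.normal(size=(m, n))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def target(x):
    """Hypothetical degree-2 polynomial target (an assumption for this demo)."""
    n = x.shape[1]
    return n * x[:, 0] * x[:, 1]


def forward(W, v, x):
    """Depth-2 ReLU network with 1/sqrt(width) output scaling."""
    pre = x @ W.T                      # pre-activations, shape (m, width)
    h = np.maximum(pre, 0.0)           # hidden layer, shape (m, width)
    return h @ v / np.sqrt(v.size), pre, h


def mse(W, v, x, y):
    pred, _, _ = forward(W, v, x)
    return float(np.mean((pred - y) ** 2))


def train_sgd(n=5, width=256, steps=5000, batch=32, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(width, n))    # random Gaussian initialization
    v = rng.normal(size=width)
    x_test = sample_sphere(rng, 2000, n)
    y_test = target(x_test)
    initial = mse(W, v, x_test, y_test)
    scale = 1.0 / np.sqrt(width)
    for _ in range(steps):
        x = sample_sphere(rng, batch, n)          # fresh samples each step
        y = target(x)
        pred, pre, h = forward(W, v, x)
        err = (pred - y) / batch                  # d(0.5 * mean sq. error)/d(pred)
        grad_v = (h.T @ err) * scale              # shape (width,)
        grad_h = np.outer(err, v) * scale         # shape (batch, width)
        grad_W = (grad_h * (pre > 0)).T @ x       # shape (width, n)
        v -= lr * grad_v                          # both layers are trained
        W -= lr * grad_W
    final = mse(W, v, x_test, y_test)
    baseline = float(np.mean(y_test ** 2))        # error of predicting zero
    return initial, final, baseline


if __name__ == "__main__":
    initial, final, baseline = train_sgd()
    print(f"test MSE before: {initial:.4f}, after: {final:.4f}, "
          f"zero predictor: {baseline:.4f}")
```

Comparing the held-out error after training with the error of the constant-zero predictor gives a quick sanity check that the network has picked up the polynomial, rather than only cancelling its own random initial output. A single run of this kind says nothing about the polynomial-time guarantee itself.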

research
11/22/2019

Neural Networks Learning and Memorization with (almost) no Over-Parameterization

Many results in recent years established polynomial time learnability of...
research
09/29/2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD

We study the problem of training a two-layer neural network (NN) of arbi...
research
09/01/2022

Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms

Neural Networks (NNs) struggle to efficiently learn certain problems, su...
research
07/22/2021

Local SGD Optimizes Overparameterized Neural Networks in Polynomial Time

In this paper we prove that Local (S)GD (or FedAvg) can optimize two-lay...
research
08/24/2021

The staircase property: How hierarchical structure can guide deep learning

This paper identifies a structural property of data distributions that e...
research
04/26/2013

An Algorithm for Training Polynomial Networks

We consider deep neural networks, in which the output of each node is a ...
research
09/03/2019

Contractility of continuous optimization

By introducing the concept of contractility, all the possible continuous...
