Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent

06/22/2020
by Surbhi Goel, et al.

We prove the first superpolynomial lower bounds for learning one-layer neural networks with respect to the Gaussian distribution using gradient descent. We show that any classifier trained by gradient descent on the square loss will fail to achieve small test error in polynomial time, given access to samples labeled by a one-layer neural network. For classification we give a stronger result: any statistical query (SQ) algorithm (including gradient descent) will fail to achieve small test error in polynomial time. Prior lower bounds held only for gradient descent run with small batch sizes, required sharp activations, and applied only to specific classes of queries. Our lower bounds hold for broad classes of activations, including ReLU and sigmoid. The core of our result is a novel construction of a simple family of neural networks that are exactly orthogonal with respect to all spherically symmetric distributions.
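To make the setup concrete, here is a minimal sketch of the learning problem the abstract describes: Gaussian inputs, labels produced by a one-layer "teacher" network, and a student trained by gradient descent on the square loss. All sizes, hyperparameters, and the ReLU choice below are our assumptions for illustration, not the paper's construction; the theorem asserts that no such procedure reaches small test error in polynomial time.

```python
# Sketch of the abstract's setting (hyperparameters and architecture are
# illustrative assumptions, not taken from the paper).
import numpy as np

rng = np.random.default_rng(0)
d, k, n, lr, steps = 20, 5, 4096, 0.05, 2000

relu = lambda z: np.maximum(z, 0.0)

# Teacher: a one-layer network f(x) = sum_i a_i * relu(w_i . x).
W_t = rng.standard_normal((k, d)) / np.sqrt(d)
a_t = rng.choice([-1.0, 1.0], size=k)

X = rng.standard_normal((n, d))          # inputs x ~ N(0, I_d)
y = relu(X @ W_t.T) @ a_t                # labels from the teacher

# Student of the same form; for simplicity only the hidden weights W
# are trained, by full-batch gradient descent on the square loss.
W = rng.standard_normal((k, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=k)

for _ in range(steps):
    H = X @ W.T                          # pre-activations, shape (n, k)
    err = relu(H) @ a - y                # residual of the square loss
    # dL/dW_i = mean over samples of err * a_i * 1[h_i > 0] * x
    grad_W = ((err[:, None] * (H > 0)) * a).T @ X / n
    W -= lr * grad_W

test_X = rng.standard_normal((n, d))
test_err = np.mean((relu(test_X @ W.T) @ a - relu(test_X @ W_t.T) @ a_t) ** 2)
print(f"square-loss test error: {test_err:.4f}")
```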
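The orthogonality property underlying the construction can be stated as follows (a sketch in our own notation; the paper's exact family is not reproduced here):

```latex
% A family f_1, ..., f_N of simple one-layer networks is exactly
% orthogonal with respect to every spherically symmetric distribution D
% on R^d if (our notation, not necessarily the paper's):
\[
  \mathbb{E}_{x \sim D}\bigl[\, f_i(x)\, f_j(x) \,\bigr] = 0
  \qquad \text{for all } i \neq j
  \text{ and all spherically symmetric } D .
\]
```

A large family of pairwise-orthogonal functions is the standard route to statistical query hardness: any single query can correlate non-negligibly with only a few members of the family, so an SQ algorithm needs superpolynomially many queries to pin down the target.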


Related research

07/09/2020: Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
We consider the dynamic of gradient descent for learning a two-layer neu...

02/26/2017: Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Deep learning models are often successfully trained using gradient desce...

11/04/2019: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals
We consider the problem of computing the best-fitting ReLU with respect ...

09/26/2019: Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Recent work has revealed that overparameterized networks trained by grad...

11/18/2020: Gradient Starvation: A Learning Proclivity in Neural Networks
We identify and formalize a fundamental gradient descent phenomenon resu...

07/26/2022: Quiver neural networks
We develop a uniform theoretical approach towards the analysis of variou...

03/26/2021: Lower Bounds on the Generalization Error of Nonlinear Learning Models
We study in this paper lower bounds for the generalization error of mode...
