Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials

07/24/2023
by   Ilias Diakonikolas, et al.

We study the problem of PAC learning a linear combination of k ReLU activations under the standard Gaussian distribution on ℝ^d with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity (dk/ϵ)^{O(k)}, where ϵ>0 is the target accuracy. Prior work had given an algorithm for this problem with complexity (dk/ϵ)^{h(k)}, where the function h(k) scales super-polynomially in k. Interestingly, the complexity of our algorithm is near-optimal within the class of Correlational Statistical Query algorithms. At a high level, our algorithm uses tensor decomposition to identify a subspace such that all the O(k)-order moments are small in the orthogonal directions. Its analysis makes essential use of the theory of Schur polynomials to show that the higher-moment error tensors are small given that the lower-order ones are.
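
For readers who want the setup in symbols, the following is a hedged sketch of the learning problem and of the simplest (degree-2) moment involved. The names a_i, w_i, h, and M_2 are notation assumed here for illustration, and the abstract does not state how the target is normalized, so the error bound is written only up to that normalization. The identity for M_2 assumes unit-norm weight vectors; it illustrates why moments vanish in directions orthogonal to the weights and is not a description of the paper's full algorithm, which works with moments up to order O(k).

```latex
% Target function and data distribution (notation assumed for illustration).
\[
  f(x) = \sum_{i=1}^{k} a_i \,\mathrm{ReLU}(\langle w_i, x\rangle),
  \qquad \mathrm{ReLU}(t) = \max(0, t),
  \qquad x \sim \mathcal{N}(0, I_d).
\]
% Learning goal under the square loss (normalization of f left unspecified).
\[
  \text{Output a hypothesis } h \text{ such that }
  \mathbb{E}_{x}\big[(h(x) - f(x))^2\big] \le \epsilon .
\]
% Degree-2 moment, assuming \|w_i\|_2 = 1 for all i.
\[
  M_2 := \mathbb{E}_{x}\big[f(x)\,(x x^{\top} - I_d)\big]
       = \frac{1}{\sqrt{2\pi}} \sum_{i=1}^{k} a_i\, w_i w_i^{\top},
\]
% Consequently, the moment vanishes in directions orthogonal to the weights.
\[
  v^{\top} M_2\, v = 0
  \quad \text{for every } v \perp \mathrm{span}\{w_1, \dots, w_k\}.
\]
```

The constant 1/√(2π) comes from E[ReLU(g)(g² − 1)] for a standard Gaussian g. Because the coefficients a_i can have mixed signs, M_2 alone may miss part of the span, which is consistent with the abstract's emphasis on controlling higher-order moments.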

research
04/20/2023

Learning Narrow One-Hidden-Layer ReLU Networks

We consider the well-studied problem of learning a linear combination of...
research
02/10/2021

Agnostic Proper Learning of Halfspaces under Gaussian Marginals

We study the problem of agnostically learning halfspaces under the Gauss...
research
02/10/2022

Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks

We give superpolynomial statistical query (SQ) lower bounds for learning...
research
06/22/2020

Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks

We study the problem of PAC learning one-hidden-layer ReLU networks with...
research
11/04/2019

Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

We consider the problem of computing the best-fitting ReLU with respect ...
research
05/31/2022

Learning (Very) Simple Generative Models Is Hard

Motivated by the recent empirical successes of deep generative models, w...
research
12/14/2020

Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models

Let V be any vector space of multivariate degree-d homogeneous polynomia...
