Approximation Schemes for ReLU Regression

05/26/2020
by Ilias Diakonikolas, et al.

We consider the fundamental problem of ReLU regression, where the goal is to output the best-fitting ReLU with respect to square loss given access to draws from some unknown distribution. We give the first efficient, constant-factor approximation algorithm for this problem assuming the underlying distribution satisfies some weak concentration and anti-concentration conditions (a class that includes, for example, all log-concave distributions). This solves the main open problem of Goel et al., who proved hardness results for any exact algorithm for ReLU regression (up to an additive ϵ). Using more sophisticated techniques, we can improve our results and obtain a polynomial-time approximation scheme for any subgaussian distribution. Given the aforementioned hardness results, these guarantees cannot be substantially improved. Our main insight is a new characterization of surrogate losses for nonconvex activations. While prior work had established the existence of convex surrogates for monotone activations, we show that properties of the underlying distribution actually induce strong convexity for the loss, allowing us to relate the global minimum to the activation's Chow parameters.
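To make the surrogate-loss idea concrete, here is a minimal sketch (not the paper's actual algorithm or its approximation analysis): for a monotone activation σ, the "matching" surrogate ∫₀^⟨w,x⟩ (σ(z) − y) dz is convex in w; for σ = ReLU it equals ReLU(⟨w,x⟩)²/2 − y⟨w,x⟩ and has gradient (ReLU(⟨w,x⟩) − y)x. The function names, step size, and synthetic-data setup below are illustrative assumptions only.

```python
# Illustrative sketch: gradient descent on a convex surrogate loss for ReLU
# regression. This is NOT the paper's algorithm; it only shows the surrogate idea.

import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def surrogate_loss(w, X, y):
    # Convex in w: 0.5*ReLU(<w,x>)^2 - y*<w,x>, averaged over the sample.
    z = X @ w
    return np.mean(0.5 * relu(z) ** 2 - y * z)

def surrogate_grad(w, X, y):
    # Gradient of the surrogate: (ReLU(<w,x>) - y) * x, averaged.
    z = X @ w
    return X.T @ (relu(z) - y) / len(y)

def square_loss(w, X, y):
    return np.mean((relu(X @ w) - y) ** 2)

def fit_relu_surrogate(X, y, lr=0.1, n_iters=2000):
    """Plain gradient descent on the convex surrogate (illustrative only)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        w -= lr * surrogate_grad(w, X, y)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 5000, 10
    X = rng.standard_normal((n, d))                        # Gaussian marginals (log-concave)
    w_true = rng.standard_normal(d)
    w_true /= np.linalg.norm(w_true)
    y = relu(X @ w_true) + 0.1 * rng.standard_normal(n)    # noisy ReLU labels

    w_hat = fit_relu_surrogate(X, y)
    print("square loss of estimate:", square_loss(w_hat, X, y))
    print("square loss of w_true:  ", square_loss(w_true, X, y))
```

Because the surrogate's gradient vanishes exactly when the estimate matches the ReLU's Chow-type moments under the data distribution, its minimizer can be related to the target; the paper's contribution is showing when this yields constant-factor or (1+ϵ)-approximate square loss.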

Related research

- 11/04/2019: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals
- 06/17/2022: Learning a Single Neuron with Adversarial Label Noise via Gradient Descent
- 02/13/2023: Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals
- 09/10/2021: ReLU Regression with Massart Noise
- 07/21/2021: Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations
- 01/20/2021: From Local Pseudorandom Generators to Hardness of Learning
- 06/18/2023: Agnostically Learning Single-Index Models using Omnipredictors
