SQ Lower Bounds for Learning Single Neurons with Massart Noise
We study the problem of PAC learning a single neuron in the presence of Massart noise. Specifically, for a known activation function f: ℝ → ℝ, the learner is given access to labeled examples (𝐱, y) ∈ ℝ^d × ℝ, where the marginal distribution of 𝐱 is arbitrary and the corresponding label y is a Massart corruption of f(⟨𝐰, 𝐱⟩) for an unknown weight vector 𝐰 ∈ ℝ^d. The goal of the learner is to output a hypothesis h: ℝ^d → ℝ with small squared loss. For a range of activation functions, including ReLUs, we establish super-polynomial Statistical Query (SQ) lower bounds for this learning problem. In more detail, we prove that no efficient SQ algorithm can approximate the optimal error within any constant factor. Our main technical contribution is a novel SQ-hard construction for learning {±1}-weight Massart halfspaces on the Boolean hypercube, which is interesting in its own right.
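To make the data model concrete, here is a minimal Python sketch of how Massart-corrupted examples for a ReLU neuron might be generated and scored. It rests on assumptions the abstract does not spell out: each label is replaced by an adversarially chosen value with a point-dependent probability η(𝐱) ≤ η < 1/2 and is otherwise clean, and 𝐱 is drawn uniformly from the Boolean hypercube purely for illustration (the abstract allows an arbitrary marginal). The names `massart_sample`, `corrupt`, and `squared_loss`, and the particular adversary, are hypothetical and not taken from the paper.

```python
import random


def relu(t):
    """ReLU activation, f(t) = max(0, t)."""
    return max(0.0, t)


def dot(w, x):
    """Inner product <w, x> of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def massart_sample(w, eta, rng, corrupt=lambda x, clean: -clean - 1.0):
    """Draw one example (x, y) and also return the clean label.

    Assumption: x is uniform on {-1, +1}^d (illustrative marginal only).
    Assumption: the adversary corrupts with rate eta(x) <= eta; here it
    uses the full rate eta when x[0] > 0 and rate 0 otherwise.
    `corrupt` is a hypothetical adversary choosing the replacement label.
    """
    x = [rng.choice((-1.0, 1.0)) for _ in range(len(w))]
    clean = relu(dot(w, x))
    eta_x = eta if x[0] > 0 else 0.0
    y = corrupt(x, clean) if rng.random() < eta_x else clean
    return x, y, clean


def squared_loss(h, examples):
    """Empirical squared loss of hypothesis h on a list of (x, y) pairs."""
    return sum((h(x) - y) ** 2 for x, y in examples) / len(examples)


if __name__ == "__main__":
    rng = random.Random(0)
    w = [1.0, -1.0, 1.0, 1.0]  # a {+1, -1}-weight vector, chosen arbitrarily
    data = [massart_sample(w, 0.3, rng) for _ in range(5000)]
    examples = [(x, y) for x, y, _ in data]
    # Even the true neuron has nonzero loss on noisy data; a learner is
    # judged against the best achievable error, not against zero.
    print(squared_loss(lambda x: relu(dot(w, x)), examples))
```

In this sketch the corruption rate varies with 𝐱 and is not known to the learner, which is the feature that distinguishes Massart noise from uniform random label noise.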