Threshold Phenomena in Learning Halfspaces with Massart Noise
We study the problem of PAC learning halfspaces on ℝ^d with Massart noise under Gaussian marginals. In the Massart noise model, an adversary is allowed to flip the label of each point 𝐱 with probability η(𝐱) ≤ η, for some parameter η ∈ [0, 1/2]. The goal of the learner is to output a hypothesis with misclassification error opt + ϵ, where opt is the error of the target halfspace. Prior work studied this problem assuming that the target halfspace is homogeneous and that the parameter η is strictly smaller than 1/2. We explore how the complexity of the problem changes when either of these assumptions is removed, establishing the following threshold phenomena:

For η = 1/2, we prove a lower bound of d^Ω(log(1/ϵ)) on the complexity of any Statistical Query (SQ) algorithm for the problem, which holds even for homogeneous halfspaces. On the positive side, we give a new learning algorithm for arbitrary halfspaces in this regime with sample complexity and running time O_ϵ(1) d^O(log(1/ϵ)).

For η < 1/2, we establish a lower bound of d^Ω(log(1/γ)) on the SQ complexity of the problem, where γ = max{ϵ, min{𝐏𝐫[f(𝐱) = 1], 𝐏𝐫[f(𝐱) = -1]}} and f is the target halfspace. In particular, this implies an SQ lower bound of d^Ω(log(1/ϵ)) for learning arbitrary Massart halfspaces (even for small constant η). We complement this lower bound with a new learning algorithm for this regime with sample complexity and runtime d^O_η(log(1/γ)) · poly(1/ϵ).

Taken together, our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
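To make the learning setup concrete, the following sketch generates labeled examples from a halfspace under standard Gaussian marginals with Massart noise. The function name and the particular choice of point-dependent flip rate η(𝐱) are illustrative assumptions (in the model, η(𝐱) ≤ η may be chosen adversarially); this is a minimal simulation of the data distribution, not an implementation of the paper's algorithms.

```python
import numpy as np

def sample_massart_halfspace(n, d, w, b, eta, seed=None):
    """Draw n examples from the halfspace f(x) = sign(<w, x> + b)
    under standard Gaussian marginals, with Massart noise at rate eta.

    Each label is flipped independently with probability eta_x <= eta.
    The model allows eta_x to depend adversarially on x; here we use
    an arbitrary illustrative choice that is noisier near the boundary.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))          # Gaussian marginals on R^d
    margin = X @ w + b
    clean = np.where(margin >= 0, 1.0, -1.0)  # noiseless labels f(x)
    eta_x = eta * np.exp(-np.abs(margin))     # eta_x <= eta pointwise
    flips = rng.random(n) < eta_x             # flip each label w.p. eta_x
    y = np.where(flips, -clean, clean)
    return X, y
```

With eta = 0 the labels coincide with the target halfspace, and any learner achieving error opt + ϵ on this distribution must tolerate the point-dependent flips when eta > 0.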