List Decodable Learning via Sum of Squares

05/12/2019 ∙ by Prasad Raghavendra, et al.

In the list-decodable learning setup, an overwhelming majority (say a (1-β)-fraction) of the input data consists of outliers, and the goal of an algorithm is to output a small list L of hypotheses such that one of them agrees with the inliers. We develop a framework for list-decodable learning via the Sum-of-Squares SDP hierarchy and demonstrate it on two basic statistical estimation problems.

Linear regression: Suppose we are given labelled examples {(X_i, y_i)}_{i ∈ [N]} containing a subset S of βN inliers {X_i}_{i ∈ S} that are drawn i.i.d. from the standard Gaussian distribution N(0,I) in R^d, and whose corresponding labels y_i are well-approximated by a linear function ℓ. We devise an algorithm that outputs a list L of linear functions such that some ℓ̂ ∈ L is close to ℓ. This yields the first algorithm for linear regression in the list-decodable setting. Our results hold for any distribution of examples whose concentration and anti-concentration can be certified by Sum-of-Squares proofs.

Mean estimation: Given data points {X_i}_{i ∈ [N]} containing a subset S of βN inliers {X_i}_{i ∈ S} that are drawn i.i.d. from a Gaussian distribution N(μ,I) in R^d, we devise an algorithm that generates a list L of means such that some μ̂ ∈ L is close to μ. The recovery guarantees of the algorithm are analogous to those of the existing algorithms of Diakonikolas et al. and Kothari et al.

In independent and concurrent work, Karmalkar and Klivans [KS19] also obtain an algorithm for list-decodable linear regression using the Sum-of-Squares SDP hierarchy.
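To make the input model concrete, here is a minimal numpy sketch of the list-decodable regression setup described above. This is NOT the paper's Sum-of-Squares algorithm; the parameter values (d, N, β, noise level) are arbitrary, and the random-subset baseline at the end merely stands in for a real list-decoding procedure.

```python
import numpy as np

# Toy illustration of the list-decodable regression input model -- not
# the Sum-of-Squares algorithm from the paper. All parameter values are
# arbitrary choices for this sketch.
rng = np.random.default_rng(0)
d, N, beta = 5, 2000, 0.2          # dimension, sample size, inlier fraction
n_in = int(beta * N)               # number of inliers, beta * N

ell = rng.normal(size=d)           # ground-truth linear function
X_in = rng.normal(size=(n_in, d))  # inliers: i.i.d. N(0, I) in R^d
y_in = X_in @ ell + 0.01 * rng.normal(size=n_in)  # labels well-approximated by ell

# Outliers may be adversarial in the model; here we just use junk data.
X_out = 3.0 * rng.normal(size=(N - n_in, d))
y_out = 10.0 * rng.normal(size=N - n_in)

X = np.vstack([X_in, X_out])
y = np.concatenate([y_in, y_out])

# A list-decoding algorithm must output a small list of candidate
# regressors such that SOME candidate is close to ell. A trivial (and
# inefficient) baseline: least squares on random size-d subsets, hoping
# one subset consists entirely of inliers.
def candidate_list(X, y, trials=50, subset=d):
    cands = []
    for _ in range(trials):
        idx = rng.choice(len(y), size=subset, replace=False)
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        cands.append(w)
    return cands

L = candidate_list(X, y)
best = min(np.linalg.norm(w - ell) for w in L)
```

With β = 0.2 and subsets of size d = 5, a random subset is all-inlier with probability only β^d ≈ 3·10⁻⁴, which is exactly why naive subsampling is exponentially inefficient and a certifiable (SoS-based) approach is needed.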


