Hidden Integrality of SDP Relaxation for Sub-Gaussian Mixture Models
We consider the problem of estimating the discrete clustering structure under Sub-Gaussian Mixture Models. Our main results establish a hidden integrality property of a semidefinite programming (SDP) relaxation for this problem: while the optimal solutions to the SDP are not integer-valued in general, their estimation errors can be upper bounded in terms of the error of an idealized integer program. The error of the integer program, and hence that of the SDP, is further shown to decay exponentially in the signal-to-noise ratio. To the best of our knowledge, this is the first exponentially decaying error bound for convex relaxations of mixture models, and our results reveal the "global-to-local" mechanism that drives the performance of the SDP relaxation. A corollary of our results shows that in certain regimes the SDP solutions are in fact integral and exact, improving on existing exact recovery results for convex relaxations. More generally, our results establish sufficient conditions for the SDP to correctly recover the cluster memberships of a (1-δ) fraction of the points for any δ∈(0,1). As a special case, we show that under the d-dimensional Stochastic Ball Model, the SDP achieves non-trivial (sometimes exact) recovery when the center separation is as small as √(1/d), which complements previous exact recovery results that require constant separation.
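To make the notion of recovering a (1-δ) fraction of the points concrete, the following minimal Python sketch computes a misclassification fraction δ between two labelings, minimized over relabelings of the clusters, and draws points from one possible instance of the Stochastic Ball Model. It is an illustration under stated assumptions, not the method of the paper: the function names are hypothetical, the sampler assumes the uniform distribution on the unit ball, and the SDP itself is not implemented here.

```python
import itertools
import math
import random


def sample_ball(center, rng):
    """Draw one point uniformly from the unit ball around `center`.

    Assumption: uniform distribution on the ball, chosen for illustration.
    """
    d = len(center)
    # A normalized Gaussian vector gives a uniformly random direction.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    # Radius U^(1/d) makes the point uniform in volume.
    r = rng.random() ** (1.0 / d)
    return [c + r * x / norm for c, x in zip(center, g)]


def misclassification_fraction(true_labels, est_labels, k):
    """Fraction of mislabeled points, minimized over permutations of k labels.

    Labels are integers in range(k). Brute force over k! permutations,
    so this is only practical for small k.
    """
    n = len(true_labels)
    best = n
    for perm in itertools.permutations(range(k)):
        errors = sum(
            1 for t, e in zip(true_labels, est_labels) if perm[e] != t
        )
        best = min(best, errors)
    return best / n


# Example: one of four points is mislabeled once the two labels are swapped.
delta = misclassification_fraction([0, 0, 1, 1], [1, 1, 0, 1], k=2)
print(delta)  # 0.25
```

In this example the estimate recovers a (1-δ) = 0.75 fraction of the memberships; exact recovery corresponds to δ = 0.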