Privately Learning Mixtures of Axis-Aligned Gaussians
We consider the problem of learning mixtures of Gaussians under the constraint of approximate differential privacy. We prove that O(k^2 d log^{3/2}(1/δ) / (α^2 ε)) samples are sufficient to learn a mixture of k axis-aligned Gaussians in ℝ^d to within total variation distance α while satisfying (ε, δ)-differential privacy. This is the first result for privately learning mixtures of unbounded axis-aligned (or even unbounded univariate) Gaussians. If the covariance matrix of each Gaussian is the identity matrix, we show that O(kd/α^2 + kd log(1/δ) / (αε)) samples are sufficient. Recently, the "local covering" technique of Bun, Kamath, Steinke, and Wu was successfully used for privately learning high-dimensional Gaussians with a known covariance matrix, and it was extended to privately learning general high-dimensional Gaussians by Aden-Ali, Ashtiani, and Kamath. Given these positive results, this approach has been proposed as a promising direction for privately learning mixtures of Gaussians. Unfortunately, we show that this approach cannot work for learning mixtures of Gaussians. Instead, we design a new technique for privately learning mixture distributions. A class of distributions ℱ is said to be list-decodable if there is an algorithm that, given "heavily corrupted" samples from some f ∈ ℱ, outputs a list of distributions such that at least one distribution in the list approximates f. We show that if ℱ is privately list-decodable, then we can privately learn mixtures of distributions in ℱ. Finally, we show that axis-aligned Gaussian distributions are privately list-decodable, thereby proving that mixtures of such distributions are privately learnable.
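The list-decodability notion invoked above can be stated a little more formally. The following is an informal sketch only; the specific parameters (corruption level γ and list size L) are illustrative notation introduced here, not taken from the abstract.

```latex
% Informal sketch of the list-decodability notion described above.
% The corruption parameter \gamma and list size L are illustrative.
\begin{definition}[List-decodable learning, informal]
A class of distributions $\mathcal{F}$ is \emph{$(\gamma, L)$-list-decodable}
if there is an algorithm that, given samples from a distribution in which only
a $\gamma$-fraction is drawn from some $f \in \mathcal{F}$ (the remainder
being arbitrary), outputs a list of distributions $f_1, \dots, f_L$ such that
\[
  \min_{1 \le i \le L} d_{\mathrm{TV}}(f_i, f) \le \alpha
\]
with high probability. In the \emph{private} variant, the algorithm must
additionally satisfy $(\varepsilon, \delta)$-differential privacy with
respect to its input sample.
\end{definition}
```

Under this framing, a mixture of k distributions from ℱ can be viewed as supplying "heavily corrupted" samples from each component, which is what lets a private list-decoder be bootstrapped into a private learner for mixtures.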