Optimal minimization of the covariance loss

05/03/2022
by Vishesh Jain, et al.
Let X be a random vector valued in ℝ^m such that ‖X‖_2 ≤ 1 almost surely. For every k ≥ 3, we show that there exists a sigma-algebra ℱ generated by a partition of ℝ^m into k sets such that ‖Cov(X) − Cov(𝔼[X|ℱ])‖_F ≲ 1/√(log k). This is optimal up to the implicit constant and improves on a previous bound due to Boedihardjo, Strohmer, and Vershynin. Our proof provides an efficient algorithm for constructing ℱ and leads to improved accuracy guarantees for k-anonymous or differentially private synthetic data. We also establish a connection between the above problem of minimizing the covariance loss and the pinning lemma from statistical physics, providing an alternate (and much simpler) algorithmic proof in the important case when X ∈ {±1}^m/√m almost surely.
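
To make the quantity being bounded concrete, the following is a minimal Python sketch that evaluates the covariance loss ‖Cov(X) − Cov(𝔼[X|ℱ])‖_F for the empirical distribution of a sample and a given partition. It is not the construction from the paper: the function name `covariance_loss`, the sample, and the sign-based partition into k = 4 sets are all assumptions chosen only for illustration, and no attempt is made to reach the 1/√(log k) rate.

```python
import numpy as np


def covariance_loss(X, labels):
    """Frobenius norm of Cov(X) - Cov(E[X | F]) for the empirical
    distribution on the rows of X, where F is generated by the
    partition encoded in `labels` (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # E[X | F]: replace each row by the mean of its partition cell.
    cond_mean = np.empty_like(X)
    for cell in np.unique(labels):
        mask = labels == cell
        cond_mean[mask] = X[mask].mean(axis=0)

    # Population-style covariances (divide by n, not n - 1).
    cov_x = np.cov(X, rowvar=False, bias=True)
    cov_cond = np.cov(cond_mean, rowvar=False, bias=True)
    return np.linalg.norm(cov_x - cov_cond, ord="fro")


rng = np.random.default_rng(0)
n, m = 2000, 5  # illustrative sample size and dimension

# Sample points and rescale so that every row has Euclidean norm <= 1.
X = rng.normal(size=(n, m))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))

# Trivial partition (one cell): the loss is just ||Cov(X)||_F.
trivial = np.zeros(n, dtype=int)

# Illustrative partition into k = 4 cells by the signs of two coordinates.
by_sign = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0).astype(int)

loss_trivial = covariance_loss(X, trivial)
loss_sign = covariance_loss(X, by_sign)
print(f"one cell:   {loss_trivial:.4f}")
print(f"four cells: {loss_sign:.4f}")
```

By the law of total covariance, Cov(X) − Cov(𝔼[X|ℱ]) equals the expected conditional covariance 𝔼[Cov(X|ℱ)], which is positive semidefinite, so refining a partition can never increase the loss. The sketch shows only this monotonicity; how fast the loss can be made to decay in k is the subject of the result above.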
