Communication Complexity of Estimating Correlations
We characterize the communication complexity of the following distributed estimation problem. Alice and Bob observe infinitely many iid copies of ρ-correlated unit-variance (Gaussian or ±1 binary) random variables, with unknown ρ ∈ [-1,1]. By interactively exchanging k bits, Bob wants to produce an estimate ρ̂ of ρ. We show that the best possible performance (optimized over interaction protocol Π and estimator ρ̂) satisfies inf_{Π,ρ̂} sup_ρ E[|ρ-ρ̂|²] = Θ(1/k). Furthermore, we show that the best possible unbiased estimator achieves performance of (1+o(1))/(2k ln 2). Curiously, thus, restricting communication to k bits results in (order-wise) the same minimax estimation error as restricting to k samples. Our results also imply an Ω(n) lower bound on the information complexity of the Gap-Hamming problem, for which we give a direct information-theoretic proof. Notably, the protocol achieving (almost) optimal performance is one-way (non-interactive). For one-way protocols we also prove the Ω(1/k) bound even when ρ is restricted to any small open sub-interval of [-1,1] (i.e. a local minimax lower bound); this local behavior remains true in the interactive setting. Our proof techniques rely on symmetric strong data-processing inequalities, various tensorization techniques from information-theoretic interactive common-randomness extraction, and (for the local lower bound) the Otto-Villani estimate for the Wasserstein continuity of trajectories of the Ornstein-Uhlenbeck semigroup.
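To illustrate the Θ(1/k) achievability in the ±1 binary case, the following minimal sketch (not the paper's exact optimal protocol) simulates the natural k-bit one-way scheme: Alice transmits her first k bits verbatim and Bob forms the empirical correlation with his own samples, giving an unbiased estimator with variance (1-ρ²)/k. All names (`correlated_pm1`, `one_way_estimate`) are illustrative, not from the paper.

```python
import numpy as np

def correlated_pm1(rho, n, rng):
    # Generate n iid pairs of rho-correlated +-1 variables:
    # Y_i equals X_i with probability (1+rho)/2, else -X_i,
    # so that E[X_i Y_i] = rho.
    x = rng.choice([-1, 1], size=n)
    flip = rng.random(n) < (1 - rho) / 2
    y = np.where(flip, -x, x)
    return x, y

def one_way_estimate(rho, k, rng):
    # Naive k-bit one-way (non-interactive) protocol:
    # Alice sends her first k bits; Bob correlates them with his samples.
    x, y = correlated_pm1(rho, k, rng)
    return np.mean(x * y)  # unbiased, variance (1 - rho^2)/k

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho = 0.4
    for k in [100, 400, 1600]:
        errs = [(one_way_estimate(rho, k, rng) - rho) ** 2
                for _ in range(2000)]
        # k * MSE stays roughly constant (about 1 - rho^2),
        # matching the Theta(1/k) scaling of the quadratic risk.
        print(k, np.mean(errs) * k)
```

The matching Ω(1/k) lower bound is the substantive part of the result: it shows no interactive protocol can beat this naive scheme by more than a constant factor.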