Log In Sign Up

Breaking The Dimension Dependence in Sparse Distribution Estimation under Communication Constraints

by   Wei-Ning Chen, et al.

We consider the problem of estimating a d-dimensional s-sparse discrete distribution from its samples observed under a b-bit communication constraint. The best-known previous result on ℓ_2 estimation error for this problem is O( slog( d/s)/n2^b). Surprisingly, we show that when sample size n exceeds a minimum threshold n^*(s, d, b), we can achieve an ℓ_2 estimation error of O( s/n2^b). This implies that when n>n^*(s, d, b) the convergence rate does not depend on the ambient dimension d and is the same as knowing the support of the distribution beforehand. We next ask the question: “what is the minimum n^*(s, d, b) that allows dimension-free convergence?”. To upper bound n^*(s, d, b), we develop novel localization schemes to accurately and efficiently localize the unknown support. For the non-interactive setting, we show that n^*(s, d, b) = O( min( d^2log^2 d/2^b, s^4log^2 d/2^b) ). Moreover, we connect the problem with non-adaptive group testing and obtain a polynomial-time estimation scheme when n = Ω̃(s^4log^4 d/2^b). This group testing based scheme is adaptive to the sparsity parameter s, and hence can be applied without knowing it. For the interactive setting, we propose a novel tree-based estimation scheme and show that the minimum sample-size needed to achieve dimension-free convergence can be further reduced to n^*(s, d, b) = Õ( s^2log^2 d/2^b).


page 1

page 2

page 3

page 4


Pointwise Bounds for Distribution Estimation under Communication Constraints

We consider the problem of estimating a d-dimensional discrete distribut...

List-Decodable Covariance Estimation

We give the first polynomial time algorithm for list-decodable covarianc...

Mean Estimation from One-Bit Measurements

We consider the problem of estimating the mean of a symmetric log-concav...

Sparse Uniformity Testing

In this paper we consider the uniformity testing problem for high-dimens...

Some effects in adaptive robust estimation under sparsity

Adaptive estimation in the sparse mean model and in sparse regression ex...

Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints

We consider parameter estimation in distributed networks, where each nod...

Temporally-Biased Sampling for Online Model Management

To maintain the accuracy of supervised learning models in the presence o...