Distributed Nonparametric Estimation under Communication Constraints
In the era of big data, it is necessary to split extremely large data sets across multiple computing nodes and construct estimators using the distributed data. When designing distributed estimators, it is desirable to minimize the amount of communication across the network because transmission between computers is slow in comparison to computations in a single computer. Our work provides a general framework for understanding the behavior of distributed estimation under communication constraints for nonparametric problems. We provide results for a broad class of models, moving beyond the Gaussian framework that dominates the literature. As concrete examples we derive minimax lower and matching upper bounds in the distributed regression, density estimation, classification, Poisson regression and volatility estimation models under communication constraints. To assist with this, we provide sufficient conditions that can be easily verified in all of our examples.
READ FULL TEXT