Heterogeneity-aware and communication-efficient distributed statistical inference

12/20/2019
by   Rui Duan, et al.

In multicenter research, individual-level data are often protected against sharing across sites. To overcome this barrier, many distributed algorithms that require sharing only aggregated information have been developed. Existing distributed algorithms usually assume that the data are homogeneously distributed across sites. This assumption ignores the important fact that data collected at different sites may come from different sub-populations and environments, which can lead to heterogeneity in the distribution of the data. Ignoring this heterogeneity may lead to erroneous statistical inference. In this paper, we propose distributed algorithms that account for heterogeneous distributions by allowing site-specific nuisance parameters. The proposed methods extend the surrogate likelihood approach to the heterogeneous setting by applying a novel density ratio tilting method to the efficient score function. The proposed algorithms maintain the same communication cost as existing communication-efficient algorithms. We establish the non-asymptotic risk bound of the proposed distributed estimator and its limiting distribution in the two-index asymptotic setting. In addition, we show that the asymptotic variance of the estimator attains the Cramér-Rao lower bound. Finally, a simulation study shows that the proposed algorithms achieve higher estimation accuracy than several existing methods.
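To make the phrase "sharing only aggregated information" concrete, the following is a minimal sketch of one round of the homogeneous surrogate likelihood update that the abstract says the proposed methods extend. It is not the paper's heterogeneity-aware algorithm: it omits site-specific nuisance parameters and the density ratio tilting step. The squared-error loss, the simulated data, and all function names (`local_gradient`, `surrogate_estimate`, `simulate`) are assumptions chosen for illustration.

```python
import numpy as np

def local_gradient(X, y, theta):
    """Gradient of the average squared-error loss (1/2n)||y - X theta||^2."""
    return -X.T @ (y - X @ theta) / len(y)

def surrogate_estimate(sites, theta_bar):
    """One round of a homogeneous surrogate-likelihood update (illustrative).

    sites: list of (X, y) pairs; sites[0] plays the role of the lead site.
    Other sites contribute only a gradient vector and a sample size,
    never their individual-level rows.
    """
    n_total = sum(len(y) for _, y in sites)
    # Aggregated information: one p-dimensional gradient per site.
    global_grad = sum(len(y) * local_gradient(X, y, theta_bar)
                      for X, y in sites) / n_total
    X1, y1 = sites[0]
    hessian_1 = X1.T @ X1 / len(y1)
    # Minimizer of L_1(theta) - <grad L_1(theta_bar) - global_grad, theta>.
    # For a quadratic loss this has a closed form.
    return theta_bar - np.linalg.solve(hessian_1, global_grad)

def simulate(seed=0, n_sites=5, n=200, p=3):
    """Hypothetical homogeneous data: same coefficients at every site."""
    rng = np.random.default_rng(seed)
    theta_true = np.arange(1, p + 1, dtype=float)
    sites = []
    for _ in range(n_sites):
        X = rng.normal(size=(n, p))
        y = X @ theta_true + rng.normal(size=n)
        sites.append((X, y))
    return sites, theta_true

sites, theta_true = simulate()
X1, y1 = sites[0]
# Initial estimate computed from the lead site alone.
theta_local = np.linalg.lstsq(X1, y1, rcond=None)[0]
theta_csl = surrogate_estimate(sites, theta_local)
```

In this sketch each site transmits a single vector of length p per round, which is the sense in which the communication cost is low. When site distributions differ, averaging gradients this way is no longer justified, which is the gap the abstract says the proposed methods address.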

research
05/25/2016

Communication-Efficient Distributed Statistical Inference

We present a Communication-efficient Surrogate Likelihood (CSL) framewor...
research
01/17/2020

Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates

Distributed statistical inference has recently attracted immense attenti...
research
05/04/2022

Validating Approximate Slope Homogeneity in Large Panels

Statistical inference for large data panels is omnipresent in modern eco...
research
05/29/2018

Distributed Statistical Inference for Massive Data

This paper considers distributed statistical inference for general symme...
research
11/04/2015

A Distributed One-Step Estimator

Distributed statistical inference has recently attracted enormous attent...
research
05/25/2020

Towards Efficient Scheduling of Federated Mobile Devices under Computational and Statistical Heterogeneity

Originated from distributed learning, federated learning enables privacy...
research
05/05/2019

Fast communication-efficient spectral clustering over distributed data

The last decades have seen a surge of interests in distributed computing...
