Federated Statistical Analysis: Non-parametric Testing and Quantile Estimation

08/20/2023
by Ori Becher, et al.

The age of big data has fueled expectations for accelerating learning. Large data sets enable more powerful statistical analyses and make conclusions more reliable, because they can be based on a broad collection of subjects. Often such data sets can be assembled only by drawing on diverse sources; for example, medical research may combine data from multiple centers in a federated analysis. However, these hopes must be balanced against data privacy concerns, which hinder the sharing of raw data among centers. Consequently, federated analyses typically resort to sharing data summaries from each center. Restricting the analysis to summaries risks impairing the efficiency of statistical procedures. In this work we take a close look at the effects of federated analysis on two very basic problems: nonparametric comparison of two groups, and quantile estimation to describe the corresponding distributions. We also propose a specific privacy-preserving data release policy for federated analysis under the K-anonymity criterion, which has been adopted by the Medical Informatics Platform of the European Human Brain Project. Our results show that, for these tasks, there is only a modest loss of statistical efficiency.


