Exploratory Analysis of Federated Learning Methods with Differential Privacy on MIMIC-III

02/08/2023
by Aron N. Horvath, et al.

Background: Federated learning methods offer the possibility of training machine learning models on privacy-sensitive datasets that cannot be easily shared. Multiple regulations impose strict requirements on the storage and use of healthcare data, leaving data in silos (i.e., locked in at healthcare facilities). Applying federated algorithms to these datasets could accelerate disease diagnosis and drug development, as well as improve patient care.

Methods: We present an extensive evaluation of the impact of different federation and differential privacy techniques when training models on the open-source MIMIC-III dataset. We analyze a set of parameters that influence federated model performance: the data distribution across sites (homogeneous vs. heterogeneous), the communication strategy (communication rounds vs. local training epochs), and the federation strategy (FedAvg vs. FedProx). Furthermore, we assess and compare two differential privacy (DP) techniques applied during model training: a stochastic gradient descent-based differential privacy algorithm (DP-SGD) and a sparse vector differential privacy technique (DP-SVT).

Results: Our experiments show that extreme data distributions across sites (an imbalance in either the number of patients or the positive-label ratios between sites) degrade model performance when training with the FedAvg strategy. This issue is resolved by FedProx with appropriate hyperparameter tuning. Furthermore, the results show that both differential privacy techniques can reach model performance similar to that of models trained without DP, however at the expense of a large quantifiable privacy leakage.

Conclusions: We empirically evaluate the benefits of the two federation strategies and propose optimal parameter choices when using differential privacy techniques.
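
For readers less familiar with the two federation strategies compared above, the sketch below illustrates how a FedProx-style local update differs from FedAvg: the client's loss gains a proximal term that penalizes drift from the current global weights. This is a minimal PyTorch-style illustration, not the authors' implementation; the function name local_update, the binary cross-entropy loss, and the hyperparameter values (mu, lr, epochs) are assumptions made for the example.

```python
import torch

def local_update(model, global_model, data_loader, mu=0.01, lr=0.01, epochs=1):
    """One client's FedProx-style local round (mu = 0 reduces to FedAvg)."""
    # Snapshot of the global weights received at the start of the round.
    global_params = [p.detach().clone() for p in global_model.parameters()]
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()  # example loss; depends on the prediction task

    for _ in range(epochs):
        for x, y in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            # FedProx proximal term: penalizes drift from the global weights,
            # which stabilizes training when client data are heterogeneous.
            prox = sum(((p - g) ** 2).sum()
                       for p, g in zip(model.parameters(), global_params))
            (loss + 0.5 * mu * prox).backward()
            optimizer.step()
    return model.state_dict()  # sent back to the server for weighted averaging
```

Setting mu = 0 recovers plain FedAvg local training, which is why FedProx can be viewed as a generalization of FedAvg that is more robust to heterogeneous client data.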

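The DP-SGD technique evaluated in the paper follows the standard recipe of clipping each per-example gradient and adding calibrated Gaussian noise before the parameter update; the sketch below illustrates that recipe only and is not the paper's training code. The function name dp_sgd_step and the values of max_norm and noise_multiplier are assumptions; a real pipeline would typically use a DP library and track the resulting privacy budget.

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y, lr=0.05,
                max_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD update: per-example clipping plus Gaussian noise (illustrative values)."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Per-example gradients: process the batch one sample at a time.
    for x, y in zip(batch_x, batch_y):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Clip each per-example gradient to L2 norm <= max_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (max_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    n = len(batch_x)
    with torch.no_grad():
        for p, s in zip(params, summed):
            # Gaussian noise calibrated to the clipping norm and noise multiplier.
            noise = torch.randn_like(s) * noise_multiplier * max_norm
            p.add_(-(lr / n) * (s + noise))
```

The clipping norm and noise multiplier jointly determine how much privacy is spent per step, which is the utility-versus-leakage trade-off examined in the experiments above.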