Distributed learning optimisation of Cox models can leak patient data: Risks and solutions

04/12/2022
by Carsten Brink, et al.

Medical data are often highly sensitive and frequently incomplete. Because of this sensitivity, there is interest in modelling methods that keep the data at each local centre to preserve privacy, yet still allow a model to be trained on data from multiple centres. One such approach is distributed machine learning (also called federated or collaborative learning), in which a model is computed iteratively from aggregated local model information supplied by each centre. However, even though no patient-level data leave the centre, there is a risk that the exchanged information is sufficient to reconstruct all or part of the patient data, which would undermine the privacy-protecting rationale of distributed learning. This paper demonstrates that the optimisation of a Cox survival model can lead to patient data leakage. We then suggest a secure way to optimise and validate a Cox model that avoids these problems. The feasibility of the suggested method is demonstrated in accompanying Matlab code, which also includes methods for handling missing data.

research
10/04/2019

Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence

A patient's health information is generally fragmented across silos. Tho...
research
05/02/2021

GRNN: Generative Regression Neural Network – A Data Leakage Attack for Federated Learning

Data privacy has become an increasingly important issue in machine learn...
research
06/20/2022

Decentralized Distributed Learning with Privacy-Preserving Data Synthesis

In the medical field, multi-center collaborations are often sought to yi...
research
10/02/2019

Privacy-preserving Federated Brain Tumour Segmentation

Due to medical data privacy regulations, it is often infeasible to colle...
research
04/17/2023

Fed-MIWAE: Federated Imputation of Incomplete Data via Deep Generative Models

Federated learning allows for the training of machine learning models on...
research
12/29/2020

Privacy-Preserving Methods for Vertically Partitioned Incomplete Data

Distributed health data networks that use information from multiple sour...
