Distributed non-disclosive validation of predictive models by a modified ROC-GLM

03/21/2022
by Daniel Schalk, et al.

Distributed statistical analyses provide a promising approach for privacy protection when analysing data distributed over several databases. They bring the analysis to the data and not the data to the analysis. The analyst receives anonymous summary statistics, which are combined into an aggregated result. We are interested in calculating the AUC of a prediction score with a distributed approach, without getting to know the data of the individual subjects held in the different databases. We use DataSHIELD as the technology to carry out distributed analyses and use newly developed algorithms to perform the validation of the prediction score. Calibration can easily be implemented in the distributed setting, but discrimination, represented by a ROC curve and its AUC, is challenging. We base our approach on the ROC-GLM algorithm as well as on ideas of differential privacy. The proposed algorithms are evaluated in a simulation study. A real-world application is described: the audit use case of DIFUTURE (Medical Informatics Initiative), whose goal is to validate a treatment prediction rule for patients with newly diagnosed multiple sclerosis.
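To illustrate why calibration is the easy part, the following is a minimal sketch in Python of pooling per-site summary statistics into a Brier score and a calibration-in-the-large estimate. It is not the authors' implementation and does not use the DataSHIELD API; the names `SiteSummary`, `site_summary`, `pooled_brier`, and `pooled_calibration_in_the_large` are hypothetical. It assumes binary outcomes coded 0/1 and scores that are predicted probabilities, and it omits the disclosure checks a real system would apply (for example a minimum site size).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SiteSummary:
    """Aggregates a site is assumed to be allowed to share (hypothetical)."""
    n: int              # number of subjects at the site
    sum_sq_err: float   # sum of (score - label)^2
    sum_pred: float     # sum of predicted probabilities
    sum_obs: float      # sum of observed 0/1 outcomes


def site_summary(scores: Sequence[float], labels: Sequence[int]) -> SiteSummary:
    """Runs at the site: reduces individual-level data to four numbers."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return SiteSummary(
        n=len(scores),
        sum_sq_err=sum((s - y) ** 2 for s, y in zip(scores, labels)),
        sum_pred=float(sum(scores)),
        sum_obs=float(sum(labels)),
    )


def pooled_brier(summaries: Sequence[SiteSummary]) -> float:
    """Runs at the analyst: Brier score over all sites from sums only."""
    total_n = sum(s.n for s in summaries)
    if total_n == 0:
        raise ValueError("no observations across sites")
    return sum(s.sum_sq_err for s in summaries) / total_n


def pooled_calibration_in_the_large(summaries: Sequence[SiteSummary]) -> float:
    """Mean observed outcome minus mean predicted probability, pooled."""
    total_n = sum(s.n for s in summaries)
    if total_n == 0:
        raise ValueError("no observations across sites")
    mean_obs = sum(s.sum_obs for s in summaries) / total_n
    mean_pred = sum(s.sum_pred for s in summaries) / total_n
    return mean_obs - mean_pred


if __name__ == "__main__":
    # Toy data standing in for two separate databases.
    a = site_summary([0.9, 0.2, 0.7], [1, 0, 1])
    b = site_summary([0.4, 0.6], [0, 1])
    print(pooled_brier([a, b]), pooled_calibration_in_the_large([a, b]))
```

Both quantities are sums over subjects, so they decompose across sites exactly. The AUC depends on how scores of cases and non-cases rank against each other across all sites, so it has no such decomposition from simple per-site sums.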

research
10/22/2022

Federated Calibration and Evaluation of Binary Classifiers

We address two major obstacles to practical use of supervised classifier...
research
03/11/2020

Deep generative models in DataSHIELD

The best way to calculate statistics from medical data is to use the dat...
research
06/27/2019

Distributed Clustering in the Anonymized Space with Local Differential Privacy

Clustering and analyzing on collected data can improve user experiences ...
research
03/07/2017

Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods

The optimal learner for prediction modeling varies depending on the unde...
research
03/01/2018

Distributed multivariable modeling for signature development under data protection constraints

Data protection constraints frequently require distributed analysis of d...
