Privacy-Preserving Methods for Vertically Partitioned Incomplete Data

12/29/2020
by Yi Deng, et al.

Distributed health data networks that use information from multiple sources have drawn substantial interest in recent years. However, missing data are prevalent in such networks and present significant analytical challenges. The current state-of-the-art methods for handling missing data require pooling data into a central repository before analysis, which may not be possible in a distributed health data network. In this paper, we propose a privacy-preserving distributed analysis framework for handling missing data when data are vertically partitioned. In this framework, each institution with a particular data source uses its local private data to calculate the necessary intermediate aggregated statistics, which are then shared to build a global model for handling missing data. To evaluate the proposed methods, we conduct simulation studies, which show that the privacy-preserving methods perform as well as methods using the pooled data and outperform several naïve methods. We further illustrate the proposed methods through the analysis of a real dataset. Because no individual-level data are shared, the proposed framework for handling vertically partitioned incomplete data is substantially more privacy-preserving than methods that require pooling the data. This can lower hurdles for collaboration across multiple institutions and build stronger public trust.


