Are My EHRs Private Enough? -Event-level Privacy Protection

06/12/2018
by Chengsheng Mao, et al.

Privacy is a major concern when sharing human subject data with researchers for secondary analyses. A simple binary consent (opt in or opt out) may significantly reduce the amount of sharable data, since many patients may be concerned about only a few sensitive medical conditions rather than their entire medical record. We propose event-level privacy protection and develop a feature ablation method to protect event-level privacy in electronic medical records. Using a list of 13 sensitive diagnoses, we evaluate the feasibility and efficacy of the proposed method. As feature ablation progresses, the identifiability of a sensitive medical condition decreases at a rate that varies across diseases. We find that these sensitive diagnoses fall into 3 categories: (1) 5 diseases with fast-declining identifiability (AUC below 0.6 with fewer than 400 features excluded); (2) 7 diseases with progressively declining identifiability (AUC below 0.7 with between 200 and 700 features excluded); and (3) 1 disease with slowly declining identifiability (AUC above 0.7 with 1000 features excluded). That the majority (12 out of 13) of the sensitive diseases fall into the first two categories suggests the potential of the proposed feature ablation method as a solution for event-level record privacy protection.
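The idea of tracking identifiability as features are ablated can be illustrated with a small sketch. The abstract does not specify the classifier, the feature-ranking criterion, or the data layout, so everything here is an assumption made for illustration: the records are synthetic binary feature vectors, features are ranked by the difference in prevalence between patients with and without the condition, and the "attacker" is a simple weighted-sum scorer. The function names (`make_records`, `feature_relevance`, `ablation_curve`) are hypothetical and are not taken from the paper.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_records(n_patients=400, n_features=30, n_informative=5, seed=0):
    """Synthetic binary records (assumption): the first n_informative features
    are more common in patients who have the sensitive condition."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_patients):
        label = 1 if rng.random() < 0.3 else 0
        row = []
        for j in range(n_features):
            p = 0.6 if (label and j < n_informative) else 0.1
            row.append(1 if rng.random() < p else 0)
        X.append(row)
        y.append(label)
    return X, y


def feature_relevance(X, y):
    """Prevalence of each feature among positives minus prevalence among negatives."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    rel = []
    for j in range(len(X[0])):
        pos_rate = sum(row[j] for row, lab in zip(X, y) if lab) / n_pos
        neg_rate = sum(row[j] for row, lab in zip(X, y) if not lab) / n_neg
        rel.append(pos_rate - neg_rate)
    return rel


def ablation_curve(X_train, y_train, X_test, y_test, steps):
    """For each k in steps, drop the k most relevant features (ranked on the
    training split) and report the held-out AUC of a weighted-sum scorer."""
    rel = feature_relevance(X_train, y_train)
    order = sorted(range(len(rel)), key=lambda j: -abs(rel[j]))
    curve = []
    for k in steps:
        kept = order[k:]  # features that remain after ablating the top k
        scores = [sum(rel[j] * row[j] for j in kept) for row in X_test]
        curve.append((k, auc(scores, y_test)))
    return curve


if __name__ == "__main__":
    X, y = make_records()
    half = len(X) // 2
    for k, a in ablation_curve(X[:half], y[:half], X[half:], y[half:], [0, 1, 3, 5, 10]):
        print(f"features ablated: {k:2d}  AUC: {a:.3f}")
```

In this toy setup the AUC should fall toward chance once the informative features are removed. A real evaluation would use actual EHR features and a stronger classifier, and the thresholds quoted in the abstract (0.6 and 0.7) apply to the authors' experiments, not to this sketch.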


