Evaluation of the Synthetic Electronic Health Records

10/16/2022
by Emily Muller, et al.

Generative models have been found effective for data synthesis due to their ability to capture complex underlying data distributions. The quality of data generated by these models is commonly evaluated by visual inspection for image datasets or by downstream analytical tasks for tabular datasets. These evaluation methods neither measure the implicit data distribution nor consider data privacy, and how to compare and rank different generative models remains an open question. Medical data can be sensitive, so it is important to address patients' privacy concerns while maintaining the utility of the synthetic dataset. Beyond utility evaluation, this work outlines two metrics, Similarity and Uniqueness, for sample-wise assessment of synthetic datasets. We demonstrate the proposed metrics with several state-of-the-art generative models used to synthesise the electronic health records (EHRs) of Cystic Fibrosis (CF) patients, and observe that the metrics are suitable for synthetic data evaluation and generative model comparison.


