Synthesizing Mixed-type Electronic Health Records using Diffusion Models

02/28/2023
by Taha Ceritli, et al.

Electronic Health Records (EHRs) contain sensitive patient information, which raises privacy concerns when such data are shared. Synthetic data generation is a promising way to mitigate these risks, and it often relies on deep generative models such as Generative Adversarial Networks (GANs). However, recent studies have shown that diffusion models offer several advantages over GANs, such as more realistic synthetic data and more stable training, across data modalities including image, text, and sound. In this work, we investigate the potential of diffusion models for generating realistic mixed-type tabular EHRs, comparing the TabDDPM model with existing methods on four datasets in terms of data quality, utility, privacy, and augmentation. Our experiments demonstrate that TabDDPM outperforms the state-of-the-art models across all evaluation metrics except privacy, which confirms the trade-off between privacy and utility.
