Generating Synthetic Mixed-type Longitudinal Electronic Health Records for Artificial Intelligent Applications

12/22/2021
by Jin Li, et al.

The recent availability of electronic health records (EHRs) has provided enormous opportunities to develop artificial intelligence (AI) algorithms. However, patient privacy has become a major concern that limits data sharing across hospital settings and subsequently hinders advances in AI. Synthetic data, which benefits from the development and proliferation of generative models, has served as a promising substitute for real patient EHR data. However, current generative models are limited in that they generate only a single type of clinical data, i.e., either continuous-valued or discrete-valued. In this paper, we propose a generative adversarial network (GAN) entitled EHR-M-GAN, which synthesizes mixed-type timeseries EHR data. EHR-M-GAN is capable of capturing the multidimensional, heterogeneous, and correlated temporal dynamics in patient trajectories. We have validated EHR-M-GAN on three publicly available intensive care unit databases with records from a total of 141,488 unique patients, and performed a privacy risk evaluation of the proposed model. EHR-M-GAN outperformed state-of-the-art benchmarks in synthesizing clinical timeseries with high fidelity. Notably, prediction models for outcomes of intensive care performed significantly better when the training data were augmented with EHR-M-GAN-generated timeseries. EHR-M-GAN may be useful for developing AI algorithms in resource-limited settings, lowering the barrier for data acquisition while preserving patient privacy.

