Natural Language Generation for Electronic Health Records

06/01/2018
by Scott Lee, et al.

A variety of methods exist for generating synthetic electronic health records (EHRs), but they are not capable of generating unstructured text, such as emergency department (ED) chief complaints, histories of present illness, or progress notes. Here, we use the encoder-decoder model, a deep learning algorithm that features in many contemporary machine translation systems, to generate synthetic chief complaints from discrete variables in EHRs, such as age group, gender, and discharge diagnosis. After being trained end-to-end on authentic records, the model can generate realistic chief complaint text that preserves much of the epidemiological information in the original data. As a side effect of the model's optimization goal, these synthetic chief complaints are also free of relatively uncommon abbreviations and misspellings, and they include none of the personally identifiable information (PII) that was in the training data, suggesting that the model may be used to support the de-identification of text in EHRs. When combined with algorithms like generative adversarial networks (GANs), our model could be used to generate fully synthetic EHRs, facilitating data sharing between healthcare providers and researchers and improving our ability to develop machine learning methods tailored to the information in healthcare data.
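
As a rough illustration of the pipeline the abstract describes, the sketch below turns a record's discrete variables into a concatenated one-hot vector and then decodes text one token at a time, always taking the highest-scoring token. It is a minimal sketch under stated assumptions, not the authors' implementation: the category lists in `FIELDS`, the helper names, and the `toy_scores` stand-in for a trained decoder are all hypothetical, and the abstract does not say which encoding or decoding strategy the model actually uses.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical category lists, for illustration only (not from the paper).
FIELDS: Dict[str, List[str]] = {
    "age_group": ["0-17", "18-44", "45-64", "65+"],
    "gender": ["F", "M"],
    "diagnosis": ["asthma", "ankle sprain", "chest pain"],
}


def one_hot(value: str, categories: Sequence[str]) -> List[int]:
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if c == value else 0 for c in categories]


def encode_record(record: Dict[str, str]) -> List[int]:
    """Concatenate the one-hot vectors of every field, in FIELDS order."""
    features: List[int] = []
    for name, categories in FIELDS.items():
        features.extend(one_hot(record[name], categories))
    return features


Scorer = Callable[[List[int], List[str]], Dict[str, float]]


def greedy_decode(features: List[int], next_token_scores: Scorer,
                  max_len: int = 10, end_token: str = "<end>") -> List[str]:
    """Repeatedly pick the highest-scoring next token until the end token."""
    tokens: List[str] = []
    for _ in range(max_len):
        scores = next_token_scores(features, tokens)
        best = max(scores, key=scores.get)
        if best == end_token:
            break
        tokens.append(best)
    return tokens


def toy_scores(features: List[int], prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a trained decoder: a fixed script, not a neural network."""
    script = ["chest", "pain", "<end>"]
    target = script[min(len(prefix), len(script) - 1)]
    return {tok: 1.0 if tok == target else 0.0
            for tok in ["chest", "pain", "sob", "<end>"]}


record = {"age_group": "45-64", "gender": "F", "diagnosis": "chest pain"}
features = encode_record(record)
complaint = " ".join(greedy_decode(features, toy_scores))
```

In a real system `toy_scores` would be replaced by a trained decoder network conditioned on `features`. Choosing high-probability tokens at each step is one plausible reason rare misspellings and abbreviations would seldom appear in generated text, although the abstract attributes this only to the model's optimization goal.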


