
Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models

by Oren Melamud, et al.

Large-scale clinical data is invaluable for driving many computational scientific advances today. However, understandable concerns regarding patient privacy hinder the open dissemination of such data and give rise to suboptimal, siloed research. De-identification methods attempt to address these concerns but have been shown to be susceptible to adversarial attacks. In this work, we focus on the vast amounts of unstructured natural language data stored in clinical notes and propose to automatically generate synthetic clinical notes that are more amenable to sharing, using generative models trained on real de-identified records. To evaluate the merit of such notes, we measure both their privacy-preservation properties and their utility in training clinical NLP models. Experiments using neural language models yield notes whose utility is close to that of the real ones in some clinical NLP tasks, yet leave ample room for future improvements.

