Towards the Creation of a Large Corpus of Synthetically-Identified Clinical Notes

03/07/2018
by Willie Boag, et al.

Clinical notes often describe the most important aspects of a patient's physiology and are therefore critical to medical research. However, these notes are typically inaccessible to researchers without prior removal of sensitive protected health information (PHI), a natural language processing (NLP) task referred to as de-identification. Tools to automatically de-identify clinical notes are needed but are difficult to create without access to those very same notes containing PHI. This work presents a first step toward creating a large synthetically-identified corpus of clinical notes and corresponding PHI annotations in order to facilitate the development of de-identification tools. Further, one such tool is evaluated against this corpus in order to understand the advantages and shortcomings of this approach.
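
To make the idea of a "synthetically-identified" note concrete, the following is a minimal sketch, not the authors' method. It assumes a de-identified note in which PHI has already been replaced by typed placeholders of the hypothetical form `[**TYPE**]`, fills each placeholder with a surrogate value, and records the character span of every inserted value as a PHI annotation. The placeholder format, the `SURROGATES` table, and the function name `synthetically_identify` are all illustrative assumptions.

```python
import random
import re

# Hypothetical surrogate values (assumption); a real corpus would need far larger lists.
SURROGATES = {
    "NAME": ["Jane Doe", "John Smith"],
    "DATE": ["2012-03-04", "2015-11-20"],
    "HOSPITAL": ["Springfield General"],
}

# Assumed placeholder format: [**TYPE**]
PLACEHOLDER = re.compile(r"\[\*\*(\w+)\*\*\]")


def synthetically_identify(note, rng):
    """Fill typed placeholders with surrogates.

    Returns the new text and a list of (start, end, phi_type) spans
    that index into the new text.
    """
    pieces = []
    annotations = []
    cursor = 0   # read position in the original note
    length = 0   # number of characters written so far
    for match in PLACEHOLDER.finditer(note):
        literal = note[cursor:match.start()]
        pieces.append(literal)
        length += len(literal)
        phi_type = match.group(1)
        choices = SURROGATES.get(phi_type)
        if choices is None:
            # Unknown type: keep the placeholder and emit no annotation.
            value = match.group(0)
        else:
            value = rng.choice(choices)
            annotations.append((length, length + len(value), phi_type))
        pieces.append(value)
        length += len(value)
        cursor = match.end()
    pieces.append(note[cursor:])
    return "".join(pieces), annotations


if __name__ == "__main__":
    note = "Pt [**NAME**] seen at [**HOSPITAL**] on [**DATE**]."
    text, spans = synthetically_identify(note, random.Random(0))
    print(text)
    for start, end, phi_type in spans:
        print(phi_type, repr(text[start:end]))
```

Because the spans are computed while the text is assembled, each annotation lines up exactly with the surrogate it labels, which is the property a de-identification tool would need from such a corpus for training or evaluation.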

research
03/24/2022

Classifying Cyber-Risky Clinical Notes by Employing Natural Language Processing

Clinical notes, which can be embedded into electronic medical records, d...
research
02/17/2021

Performance of Automatic De-identification Across Different Note Types

Free-text clinical notes detail all aspects of patient care and have gre...
research
02/02/2020

Assessment of Amazon Comprehend Medical: Medication Information Extraction

On November 27, 2018, Amazon Web Services (AWS) released Amazon Comprehe...
research
04/15/2021

Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?

Large Transformers pretrained over clinical notes from Electronic Health...
research
03/12/2020

The Medical Scribe: Corpus Development and Model Performance Analyses

There is a growing interest in creating tools to assist in clinical note...
research
07/02/2020

NLNDE: The Neither-Language-Nor-Domain-Experts' Way of Spanish Medical Document De-Identification

Natural language processing has huge potential in the medical domain whi...
research
10/03/2018

A Deep Learning Architecture for De-identification of Patient Notes: Implementation and Evaluation

De-identification is the process of removing 18 protected health informa...
