Benchmarking Modern Named Entity Recognition Techniques for Free-text Health Record De-identification

03/25/2021
by Abdullah Ahmed, et al.

Electronic Health Records (EHRs) have become the primary form of medical data-keeping across the United States. Federal law restricts the sharing of any EHR data that contains protected health information (PHI). De-identification, the process of identifying and removing all PHI, is crucial for making EHR data publicly available for scientific research. This project explores several deep learning-based named entity recognition (NER) methods to determine which perform best on the de-identification task. We trained and tested our models on the i2b2 training dataset and qualitatively assessed their performance using EHR data collected from a local hospital. We found that 1) BiLSTM-CRF is the best-performing encoder/decoder combination, 2) character embeddings and CRFs tend to improve precision at the price of recall, and 3) transformers alone under-perform as context encoders. Future work focused on structuring medical text may improve the extraction of semantic and syntactic information for the purposes of EHR de-identification.
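
To make the de-identification step concrete, the sketch below shows one way the output of an NER tagger could be turned into redacted text. It is illustrative only and not the authors' pipeline: the function name `redact`, the BIO tag labels (`NAME`, `DATE`), and the example sentence are all assumptions, and the tags are hard-coded where a real system would use the predictions of a trained model such as a BiLSTM-CRF.

```python
from typing import List, Optional


def redact(tokens: List[str], tags: List[str]) -> str:
    """Replace each BIO-tagged PHI span with a single [TYPE] placeholder.

    Hypothetical helper: the tag scheme and placeholder format are assumptions.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    out: List[str] = []
    open_type: Optional[str] = None  # PHI type of the span being replaced, if any
    for token, tag in zip(tokens, tags):
        if tag == "O":
            # Non-PHI token: keep it and close any open span.
            out.append(token)
            open_type = None
            continue
        prefix, _, phi_type = tag.partition("-")
        # Emit a placeholder on B-, or on an I- that does not continue the open span.
        if prefix == "B" or phi_type != open_type:
            out.append(f"[{phi_type}]")
        open_type = phi_type
    return " ".join(out)


# Made-up example sentence with hand-written tags (not real patient data).
tokens = ["Mr.", "John", "Smith", "was", "admitted", "on", "March", "3", "."]
tags = ["O", "B-NAME", "I-NAME", "O", "O", "O", "B-DATE", "I-DATE", "O"]
print(redact(tokens, tags))  # Mr. [NAME] was admitted on [DATE] .
```

In this framing, a missed PHI token (lower recall) leaks identifying text into the output, while a false positive (lower precision) only removes harmless words, which is why the precision/recall trade-off noted in the abstract matters for this task.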


