Improved Patient Classification with Language Model Pretraining Over Clinical Notes

09/06/2019
by Jonas Kemp, et al.

Clinical notes in electronic health records contain highly heterogeneous writing styles, including non-standard terminology and abbreviations. Using these notes in predictive modeling has traditionally required preprocessing (e.g., selecting frequent terms or topic modeling) that removes much of the richness of the source data. We propose a pretrained hierarchical recurrent neural network model that parses minimally processed clinical notes in an intuitive fashion, and show that it improves performance on multiple classification tasks on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, increasing top-5 recall to 89.7% and AUPRC to 35.2% on diagnosis classification tasks, compared to models that treat the notes as an unordered collection of terms or that omit pretraining. We also apply an attribution technique to several examples to identify the words, and the nearby context, that the model uses to make its predictions, and show the importance of each word's context.
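
For readers unfamiliar with the top-5 recall metric reported above, the following is a minimal sketch of how top-k recall is commonly computed for a single-label classification task. It is an illustration only, not the authors' evaluation code; the function name `top_k_recall` and the toy scores and labels are hypothetical.

```python
def top_k_recall(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per example (hypothetical toy input).
    labels: list of true class indices, one per example.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 examples and 3 classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

recall_at_1 = top_k_recall(scores, labels, k=1)
recall_at_2 = top_k_recall(scores, labels, k=2)
```

In this toy data only the first example has its true label ranked first, while the second example's label appears within the top two, so recall rises as k grows. With k=5 over a large set of diagnosis codes, the same idea measures how often the correct diagnosis appears among the model's five highest-ranked candidates.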


