Neural Language Model for Automated Classification of Electronic Medical Records at the Emergency Room. The Significant Benefit of Unsupervised Generative Pre-training

by Binbin Xu et al.

In the context of a project to build a national injury surveillance system based on emergency room (ER) visit reports, it was necessary to develop a coding system capable of classifying the causes of these visits based on the automatic reading of clinical notes written by clinicians. Supervised learning techniques have shown good results but require the manual coding of a large number of texts for model training. New levels of performance have been achieved with neural language models (NLMs) that use the Transformer architecture together with an unsupervised generative pre-training step. Our hypothesis is that this latter method significantly reduces the number of annotated samples required.

We derived the traumatic/non-traumatic nature of the cause of the ER visit from the available diagnostic codes. We then designed a case study to predict, from free-text clinical notes, whether a visit was traumatic or not. We compared two strategies. Strategy A consisted of training the GPT-2 NLM on the training data (up to 161,930 samples), all labeled (trauma/non-trauma), in a single fully supervised phase. In Strategy B, we split the training data into two parts: 151,930 unlabeled samples for a self-supervised pre-training phase, and a much smaller part (up to 10,000 samples) for supervised fine-tuning with labels.

With Strategy A, AUC and F1-score reached 0.97 and 0.89, respectively, after processing 40,000 samples. With generative pre-training (Strategy B), the model achieved an AUC of 0.93 and an F1-score of 0.80 after processing only 120 labeled samples. The same performance was achieved with only 30 labeled samples processed 3 times (3 epochs of learning). In conclusion, a general-purpose NLM such as GPT-2 can easily be adapted into a powerful classifier of free-text notes while requiring only a very small number of labeled samples.
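The paper itself does not show code, but the supervised fine-tuning step of Strategy B can be sketched with the Hugging Face `transformers` library, which provides a GPT-2 model with a sequence-classification head. Everything below is an illustrative assumption, not the authors' implementation: a tiny randomly initialized configuration is used so the sketch runs without downloading weights, whereas the study would load the actual pre-trained GPT-2 (e.g. `GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)`) and feed it tokenized clinical notes.

```python
import torch
from transformers import GPT2Config, GPT2ForSequenceClassification

# Tiny, randomly initialized configuration so this sketch runs offline.
# (Hypothetical stand-in for the real pre-trained GPT-2 used in the study.)
config = GPT2Config(
    vocab_size=1000, n_positions=64, n_embd=32, n_layer=2, n_head=2,
    num_labels=2,    # binary target: trauma vs. non-trauma
    pad_token_id=0,  # lets the classification head locate each sequence's last token
)
model = GPT2ForSequenceClassification(config)

# Toy batch standing in for 4 tokenized clinical notes with their labels.
input_ids = torch.randint(1, 1000, (4, 16))
labels = torch.tensor([0, 1, 1, 0])

# One supervised fine-tuning step (the second, labeled phase of Strategy B).
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
out = model(input_ids=input_ids, labels=labels)  # cross-entropy loss + logits
out.loss.backward()
optimizer.step()

print(tuple(out.logits.shape))  # one (trauma, non-trauma) score pair per note
```

With a pre-trained backbone, repeating this step over as few as 30 labeled notes for 3 epochs corresponds to the low-resource regime reported in the abstract.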




