Classifying Unstructured Clinical Notes via Automatic Weak Supervision

06/24/2022
by Chufan Gao, et al.

Healthcare providers usually record detailed notes of the care delivered to each patient for clinical, research, and billing purposes. Because these narratives are unstructured, providers employ dedicated staff to assign diagnostic codes to patients' diagnoses using the International Classification of Diseases (ICD) coding system. This manual process is time-consuming, costly, and error-prone. Prior work has demonstrated the potential utility of machine learning (ML) methods for automating this process, but it has relied on large quantities of manually labeled data to train the models. In addition, diagnostic coding systems evolve over time, so traditional supervised learning strategies fail to generalize beyond local applications. In this work, we introduce a general weakly supervised text classification framework that learns from class-label descriptions only, without using any human-labeled documents. It leverages the linguistic domain knowledge stored in pre-trained language models, together with the data programming framework, to assign code labels to individual texts. We demonstrate the efficacy and flexibility of our method by comparing it to state-of-the-art weakly supervised text classifiers on four real-world text classification datasets, and by assigning ICD codes to medical notes in the publicly available MIMIC-III database.
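The general idea of labeling documents from class descriptions alone can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the pre-trained language model for plain keyword matching and swaps the data programming label model for a simple majority vote. The class descriptions, the `ABSTAIN` constant, and every function name below are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

ABSTAIN = -1  # hypothetical sentinel: a labeling function casts no vote

# Hypothetical class-label descriptions (not taken from the paper or from ICD).
CLASS_DESCRIPTIONS = {
    0: "pneumonia lung infection cough",
    1: "fracture broken bone injury",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def make_keyword_lf(label, keyword):
    """Build a labeling function that votes `label` when `keyword` occurs."""
    def lf(text):
        return label if keyword in tokenize(text) else ABSTAIN
    return lf


def build_lfs(descriptions):
    """Create one keyword labeling function per word of each description."""
    return [
        make_keyword_lf(label, keyword)
        for label, description in descriptions.items()
        for keyword in tokenize(description)
    ]


def majority_vote(text, lfs):
    """Aggregate votes by simple majority; abstain on no votes or a tie.

    This stands in for a learned label model, which would instead weight
    each labeling function by its estimated accuracy.
    """
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes)
    best = max(counts.values())
    winners = [label for label, n in counts.items() if n == best]
    return winners[0] if len(winners) == 1 else ABSTAIN


if __name__ == "__main__":
    lfs = build_lfs(CLASS_DESCRIPTIONS)
    for note in ["Patient presents with cough and lung infection",
                 "X-ray shows a broken bone",
                 "Routine follow-up visit"]:
        print(note, "->", majority_vote(note, lfs))
```

The weak labels produced this way would typically be used to train a downstream classifier; notes on which every labeling function abstains are simply left unlabeled.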


