BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining

05/26/2020
by Zachariah Zhang, et al.

Clinical interactions are initially recorded and documented in free-text medical notes. ICD coding is the task of classifying and coding all diagnoses, symptoms, and procedures associated with a patient's visit. The process is often manual, and it is extremely time-consuming and expensive for hospitals. In this paper, we propose a machine learning model, BERT-XML, for large-scale automated ICD coding from EHR notes, utilizing recently developed unsupervised pretraining methods that have achieved state-of-the-art performance on a variety of NLP tasks. We train a BERT model from scratch on EHR notes, learning a vocabulary better suited to EHR tasks, and thus outperform off-the-shelf models. We adapt the BERT architecture for ICD coding with multi-label attention. While other works focus on small public medical datasets, we have produced the first large-scale ICD-10 classification model, using millions of EHR notes to predict thousands of unique ICD codes.
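The abstract does not spell out how the multi-label attention is computed, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes each ICD code has its own query vector that attends over the encoder's token representations, followed by a per-code sigmoid output. The function name `multi_label_attention`, the parameter names, the toy dimensions, and the random values standing in for BERT outputs are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_label_attention(hidden, label_queries, out_weights, out_biases):
    """Return one probability per label (hypothetical formulation).

    hidden:        T token vectors of size D (stand-in for encoder outputs)
    label_queries: one query vector of size D per label
    out_weights:   one output weight vector of size D per label
    out_biases:    one scalar bias per label
    """
    dim = len(hidden[0])
    probs = []
    for query, weight, bias in zip(label_queries, out_weights, out_biases):
        # Attention weights of this label over all tokens.
        alpha = softmax([dot(query, h) for h in hidden])
        # Label-specific summary of the note: attention-weighted token average.
        context = [sum(a * h[i] for a, h in zip(alpha, hidden)) for i in range(dim)]
        # Independent sigmoid per label, since a note can carry many codes.
        logit = dot(weight, context) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


# Toy example: 6 tokens, hidden size 4, 3 candidate codes (all values random).
random.seed(0)
T, D, L = 6, 4, 3
hidden = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
label_queries = [[random.gauss(0, 1) for _ in range(D)] for _ in range(L)]
out_weights = [[random.gauss(0, 1) for _ in range(D)] for _ in range(L)]
out_biases = [0.0] * L

probs = multi_label_attention(hidden, label_queries, out_weights, out_biases)
predicted = [i for i, p in enumerate(probs) if p >= 0.5]
print(probs, predicted)
```

Because each label is scored with its own sigmoid rather than a shared softmax, any number of codes can be predicted for a single note; the 0.5 threshold used here is an arbitrary illustrative choice.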
