BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining

05/26/2020
by   Zachariah Zhang, et al.
NYU Langone Medical Center

Clinical interactions are initially recorded and documented in free-text medical notes. ICD coding is the task of classifying and coding all diagnoses, symptoms, and procedures associated with a patient's visit. The process is often manual, and it is extremely time-consuming and expensive for hospitals. In this paper, we propose a machine learning model, BERT-XML, for large-scale automated ICD coding from EHR notes, utilizing recently developed unsupervised pretraining techniques that have achieved state-of-the-art performance on a variety of NLP tasks. We train a BERT model from scratch on EHR notes, learning with a vocabulary better suited for EHR tasks and thus outperforming off-the-shelf models. We adapt the BERT architecture for ICD coding with multi-label attention. While other works focus on small public medical datasets, we have produced the first large-scale ICD-10 classification model, using millions of EHR notes to predict thousands of unique ICD codes.
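The multi-label attention mentioned in the abstract can be illustrated with a minimal sketch: each ICD label gets its own attention query over the encoder's token representations, producing a label-specific document vector that is scored independently with a sigmoid. This is an assumption-laden NumPy illustration of the general per-label attention pattern, not the authors' actual implementation; all function and variable names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multi_label_attention(H, U, W, b):
    """Per-label attention over token representations (illustrative sketch).

    H: (seq_len, hidden)    token representations from a BERT-style encoder
    U: (num_labels, hidden) label-specific attention query vectors
    W: (num_labels, hidden) per-label output weights
    b: (num_labels,)        per-label biases
    Returns independent per-label probabilities, shape (num_labels,).
    """
    scores = U @ H.T                   # (num_labels, seq_len) attention scores
    alpha = softmax(scores, axis=-1)   # each label attends over all tokens
    context = alpha @ H                # (num_labels, hidden) label-specific doc vectors
    logits = (W * context).sum(axis=-1) + b
    return sigmoid(logits)             # multi-label: sigmoid, not softmax

# Toy dimensions; a real EHR model would use the encoder's hidden size
# and thousands of ICD labels.
rng = np.random.default_rng(0)
seq_len, hidden, num_labels = 16, 8, 5
probs = multi_label_attention(
    rng.normal(size=(seq_len, hidden)),
    rng.normal(size=(num_labels, hidden)),
    rng.normal(size=(num_labels, hidden)),
    np.zeros(num_labels),
)
```

The key design choice this pattern captures: because a single note can mention many diagnoses in different spans, each label attends to its own evidence in the text rather than sharing one pooled document vector, and labels are scored independently with sigmoids since ICD coding is multi-label, not multi-class.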
