MDACE: MIMIC Documents Annotated with Code Evidence

07/07/2023
by Hua Cheng, et al.

We introduce a dataset for evidence/rationale extraction on an extreme multi-label classification task over long medical documents. One such task is Computer-Assisted Coding (CAC), which has improved significantly in recent years thanks to advances in machine learning. Yet simply predicting a set of final codes for a patient encounter is insufficient, because CAC systems are required to provide supporting textual evidence to justify the billing codes. A model able to produce accurate and reliable supporting evidence for each code would be a tremendous benefit. However, a human-annotated code evidence corpus is extremely difficult to create because it requires specialized knowledge. In this paper, we introduce MDACE, the first publicly available code evidence dataset, which is built on a subset of the MIMIC-III clinical records. The dataset, annotated by professional medical coders, consists of 302 Inpatient charts with 3,934 evidence spans and 52 Profee charts with 5,563 evidence spans. We implemented several evidence extraction methods based on the EffectiveCAN model (Liu et al., 2021) to establish baseline performance on this dataset. MDACE can be used to evaluate code evidence extraction methods for CAC systems, as well as the accuracy and interpretability of deep learning models for multi-label classification. We believe that the release of MDACE will greatly improve the understanding and application of deep learning technologies for medical coding and document classification.

research
11/24/2022

Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt

Automatic International Classification of Diseases (ICD) coding aims to ...
research
04/21/2023

Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study

Medical coding is the task of assigning medical codes to clinical free-t...
research
02/07/2018

An Empirical Evaluation of Deep Learning for ICD-9 Code Assignment using MIMIC-III Clinical Notes

Code assignment is important on many levels in the modern hospital, from...
research
11/05/2018

Medical code prediction with multi-view convolution and description-regularized label-dependent attention

A ubiquitous task in processing electronic medical data is the assignmen...
research
06/24/2021

Modeling Diagnostic Label Correlation for Automatic ICD Coding

Given the clinical notes written in electronic health records (EHRs), it...
research
04/02/2021

Multitask Recalibrated Aggregation Network for Medical Code Prediction

Medical coding translates professionally written medical reports into st...
research
09/06/2021

Multi-task Balanced and Recalibrated Network for Medical Code Prediction

Human coders assign standardized medical codes to clinical documents gen...
