From Extreme Multi-label to Multi-class: A Hierarchical Approach for Automated ICD-10 Coding Using Phrase-level Attention

02/18/2021
by   Cansu Sen, et al.

Clinical coding is the task of assigning a set of alphanumeric codes, referred to as ICD (International Classification of Diseases) codes, to a medical event based on the context captured in a clinical narrative. The latest version of ICD, ICD-10, includes more than 70,000 codes. Because manual coding is labor-intensive and error-prone, automatic ICD coding of medical reports using machine learning has gained significant interest in the last decade. Existing literature has modeled this problem as a multi-label task. Nevertheless, such a multi-label approach is challenging due to the extremely large label set size. Furthermore, the interpretability of the predictions is essential for the end users (e.g., healthcare providers and insurance companies). In this paper, we propose a novel approach for automatic ICD coding by reformulating the extreme multi-label problem into a simpler multi-class problem using a hierarchical solution. We made this approach viable through extensive data collection to acquire phrase-level human coder annotations, which supervise our models in learning the specific relations between the input text and predicted ICD codes. Our approach employs two independently trained networks, the sentence tagger and the ICD classifier, stacked hierarchically to predict a codeset for a medical report. The sentence tagger identifies focus sentences containing a medical event or concept relevant to ICD coding. Using a supervised attention mechanism, the ICD classifier then assigns an ICD code to each focus sentence. The proposed approach outperforms strong baselines by large margins of 23 instance based F-1. With our proposed approach, interpretability is achieved not through implicitly learned attention scores but by attributing each prediction to a particular sentence and words selected by human coders.
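
The two-stage design described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two trained networks are replaced by hypothetical keyword-based stand-ins (`toy_is_focus`, `toy_classify`), the keyword table and sentence splitter are assumptions made for illustration, and the supervised attention mechanism is omitted entirely. The sketch shows only the control flow: tag focus sentences, assign exactly one code to each (a multi-class decision), and take the union as the report's codeset while keeping the sentence behind each prediction.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

# Hypothetical keyword table standing in for both trained networks.
# The two ICD-10 codes are used purely as illustrative labels.
TOY_KEYWORDS = {"pneumonia": "J18.9", "hypertension": "I10"}


@dataclass
class Prediction:
    sentence: str  # the focus sentence the code is attributed to
    code: str      # the single ICD code assigned to that sentence


def split_sentences(report: str) -> List[str]:
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def toy_is_focus(sentence: str) -> bool:
    # Stand-in for the sentence tagger: a sentence is a focus sentence
    # if it mentions any known keyword.
    lowered = sentence.lower()
    return any(keyword in lowered for keyword in TOY_KEYWORDS)


def toy_classify(sentence: str) -> str:
    # Stand-in for the ICD classifier: return exactly one code per focus
    # sentence, which is what makes this stage multi-class.
    lowered = sentence.lower()
    for keyword, code in TOY_KEYWORDS.items():
        if keyword in lowered:
            return code
    raise ValueError("classifier called on a non-focus sentence")


def predict_codeset(
    report: str,
    is_focus: Callable[[str], bool] = toy_is_focus,
    classify: Callable[[str], str] = toy_classify,
) -> Tuple[Set[str], List[Prediction]]:
    # Stage 1: keep only focus sentences.
    focus = [s for s in split_sentences(report) if is_focus(s)]
    # Stage 2: one code per focus sentence, with the sentence kept as evidence.
    evidence = [Prediction(sentence=s, code=classify(s)) for s in focus]
    # The report-level codeset is the union of the sentence-level predictions.
    return {p.code for p in evidence}, evidence


report = (
    "Patient presents with cough. Chest X-ray confirms pneumonia. "
    "History of hypertension. Follow up in two weeks."
)
codes, evidence = predict_codeset(report)
print(sorted(codes))
for p in evidence:
    print(p.code, "<-", p.sentence)
```

Because every predicted code is tied to one sentence, the multi-label output for the report is recovered from independent single-label decisions, and each code can be traced back to its supporting text.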
