Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding

10/07/2022
by Zhichao Yang, et al.

Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note with an average length of 3,000+ tokens. This task is challenging due to the high-dimensional space of multi-label assignment (tens of thousands of ICD codes) and the long-tail challenge: only a few codes (common diseases) are frequently assigned, while most codes (rare diseases) are infrequently assigned. This study addresses the long-tail challenge by adapting a prompt-based fine-tuning technique with label semantics, which has been shown to be effective in the few-shot setting. To further enhance performance in the medical domain, we propose a knowledge-enhanced Longformer that injects three kinds of domain-specific knowledge (hierarchy, synonym, and abbreviation) through additional pretraining with contrastive learning. Experiments on MIMIC-III-full, a benchmark dataset for code assignment, show that our proposed method outperforms the previous state-of-the-art method by 14.5 [...]. To further test our model in the few-shot setting, we created a new rare disease coding dataset, MIMIC-III-rare50, on which our model improves macro F1 from 17.1 to 30.4 and micro F1 from 17.2 to 32.6 compared to the previous method.
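The abstract reports both macro F1 and micro F1, and the gap between them matters for long-tail label sets: micro F1 pools counts over all codes, so frequent codes dominate, while macro F1 weights every code equally. The following is a minimal sketch of that difference, not the authors' evaluation script. The function names and the toy labels `common_code` and `rare_code` are hypothetical, and it assumes macro F1 is the mean of per-code F1 scores (some ICD coding evaluation scripts may instead derive it from macro-averaged precision and recall).

```python
from typing import List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(
    gold: List[Set[str]], pred: List[Set[str]], codes: List[str]
) -> Tuple[float, float]:
    """Return (macro F1, micro F1) for multi-label predictions.

    gold and pred hold one set of codes per note, in the same order.
    """
    per_code_f1 = []
    total_tp = total_fp = total_fn = 0
    for code in codes:
        tp = sum(1 for g, p in zip(gold, pred) if code in g and code in p)
        fp = sum(1 for g, p in zip(gold, pred) if code not in g and code in p)
        fn = sum(1 for g, p in zip(gold, pred) if code in g and code not in p)
        per_code_f1.append(f1(tp, fp, fn))
        total_tp += tp
        total_fp += fp
        total_fn += fn
    macro = sum(per_code_f1) / len(per_code_f1)   # every code counts equally
    micro = f1(total_tp, total_fp, total_fn)      # pooled counts, frequent codes dominate
    return macro, micro


# Toy data (hypothetical): the model always predicts the common code
# and never predicts the rare one.
codes = ["common_code", "rare_code"]
gold = [{"common_code"}, {"common_code"}, {"common_code"}, {"common_code", "rare_code"}]
pred = [{"common_code"}, {"common_code"}, {"common_code"}, {"common_code"}]

macro, micro = macro_micro_f1(gold, pred, codes)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case the missed rare code halves macro F1 while micro F1 stays close to 1, which illustrates why a rare-code benchmark reports both numbers.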


research
11/24/2022

Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt

Automatic International Classification of Diseases (ICD) coding aims to ...
research
03/03/2022

Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic ICD Coding

Automatic ICD coding is defined as assigning disease codes to electronic...
research
06/29/2021

Few-Shot Electronic Health Record Coding through Graph Contrastive Learning

Electronic health record (EHR) coding is the task of assigning ICD codes...
research
05/22/2023

Copy Recurrent Neural Network Structure Network

Electronic Health Record (EHR) coding involves automatically classifying...
research
12/09/2022

HieNet: Bidirectional Hierarchy Framework for Automated ICD Coding

International Classification of Diseases (ICD) is a set of classificatio...
research
12/12/2022

Automated ICD Coding using Extreme Multi-label Long Text Transformer-based Models

Background: Encouraged by the success of pretrained Transformer models i...
research
09/28/2019

Generalized Zero-shot ICD Coding

The International Classification of Diseases (ICD) is a list of classifi...
