A Multi-View Joint Learning Framework for Embedding Clinical Codes and Text Using Graph Neural Networks

01/27/2023
by Lecheng Kong, et al.

Learning to represent free text is a core task in many clinical machine learning (ML) applications, as clinical text contains observations and plans not otherwise available for inference. State-of-the-art methods use large language models developed with immense computational resources and training data; however, applying these models is challenging because of the highly varying syntax and vocabulary in clinical free text. Structured information such as International Classification of Diseases (ICD) codes often succinctly abstracts the most important facts of a clinical encounter and yields good performance, but is often less available than clinical text in real-world scenarios. We propose a multi-view learning framework that jointly learns from codes and text to combine the availability and forward-looking nature of text with the better performance of ICD codes. The learned text embeddings can be used as inputs to predictive algorithms independently of the ICD codes during inference. Our approach uses a Graph Neural Network (GNN) to process ICD codes and a Bi-LSTM to process text. We apply Deep Canonical Correlation Analysis (DCCA) to encourage the two views to learn similar representations of each patient. In experiments using planned surgical procedure text, our model outperforms BERT models fine-tuned to clinical data, and in experiments using diverse text in MIMIC-III, our model is competitive with a fine-tuned BERT at a tiny fraction of its computational effort.
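To make the DCCA step more concrete, the following is a minimal sketch of the correlation objective that DCCA-style training maximizes between two views. It is not the authors' implementation: the GNN and Bi-LSTM encoders are replaced by fixed synthetic arrays, and the function name `cca_correlation`, the regularization constant `reg`, and all data are assumptions made for illustration only.

```python
import numpy as np


def cca_correlation(view_a, view_b, reg=1e-4):
    """Sum of canonical correlations between two views (rows = patients).

    A DCCA-style loss would be the negative of this value, back-propagated
    through both encoders. `reg` is an assumed ridge term for stability.
    """
    n = view_a.shape[0]
    a = view_a - view_a.mean(axis=0)  # center each view
    b = view_b - view_b.mean(axis=0)
    cov_aa = a.T @ a / (n - 1) + reg * np.eye(a.shape[1])
    cov_bb = b.T @ b / (n - 1) + reg * np.eye(b.shape[1])
    cov_ab = a.T @ b / (n - 1)

    def inv_sqrt(m):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        eigvals, eigvecs = np.linalg.eigh(m)
        return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

    t = inv_sqrt(cov_aa) @ cov_ab @ inv_sqrt(cov_bb)
    # The singular values of t are the canonical correlations.
    return np.linalg.svd(t, compute_uv=False).sum()


# Synthetic stand-ins for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))           # shared patient state
mix = np.array([[1.0, 0.5, 0.0],
                [0.0, 1.0, 0.5],
                [0.5, 0.0, 1.0]])
code_view = latent                           # stand-in for GNN output
text_view = latent @ mix                     # stand-in for Bi-LSTM output
noise_view = rng.normal(size=(500, 3))       # unrelated embeddings

aligned = cca_correlation(code_view, text_view)
unrelated = cca_correlation(code_view, noise_view)
print(aligned, unrelated)
```

In this sketch, two views that are linear transforms of the same latent state score close to the embedding dimension, while unrelated views score much lower. Maximizing this quantity is what pushes the text embedding toward the code embedding, so that the text encoder can be used alone at inference time.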


research
04/19/2021

Neural Language Models with Distant Supervision to Identify Major Depressive Disorder from Clinical Notes

Major depressive disorder (MDD) is a prevalent psychiatric disorder that...
research
08/24/2020

Prediction of ICD Codes with Clinical BERT Embeddings and Text Augmentation with Label Balancing using MIMIC-III

This paper achieves state of the art results for the ICD code prediction...
research
08/11/2023

Large Language Models to Identify Social Determinants of Health in Electronic Health Records

Social determinants of health (SDoH) have an important impact on patient...
research
11/20/2022

Artificial Interrogation for Attributing Language Models

This paper presents solutions to the Machine Learning Model Attribution ...
research
07/14/2022

GrabQC: Graph based Query Contextualization for automated ICD coding

Automated medical coding is a process of codifying clinical notes to app...
research
11/25/2019

ICD Coding from Clinical Text Using Multi-Filter Residual Convolutional Neural Network

Automated ICD coding, which assigns the International Classification of ...
research
05/08/2023

Autoencoder-based prediction of ICU clinical codes

Availability of diagnostic codes in Electronic Health Records (EHRs) is ...
