Graph-Text Multi-Modal Pre-training for Medical Representation Learning

03/18/2022
by Sungjin Park, et al.

As the volume of Electronic Health Records (EHR) sharply grows, there has been emerging interest in learning representations of EHR for healthcare applications. Representation learning of EHR requires appropriate modeling of its two dominant modalities: structured data and unstructured text. In this paper, we present MedGTX, a pre-trained model for multi-modal representation learning of structured and textual EHR data. MedGTX uses a novel graph encoder to exploit the graphical nature of structured EHR data, a text encoder to handle unstructured text, and a cross-modal encoder to learn a joint representation space. We pre-train our model through four proxy tasks on MIMIC-III, an open-source EHR dataset, and evaluate it on two clinical benchmarks and three novel downstream tasks that tackle real-world problems in EHR data. The results consistently show the effectiveness of pre-training the model for joint representation of both structured and unstructured information from EHR. Given the promising performance of MedGTX, we believe this work opens a new door to jointly understanding the two fundamental modalities of EHR data.
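The core idea of the cross-modal encoder — letting structured-EHR node embeddings attend to text token embeddings so the two modalities share a joint representation space — can be sketched minimally. This is an illustrative single-head cross-attention step in numpy, not the paper's actual architecture; the function name, dimensions, and random inputs are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(graph_repr, text_repr):
    """One cross-attention step: graph nodes attend to text tokens.

    graph_repr: (n_nodes, d) embeddings from a graph encoder (hypothetical)
    text_repr:  (n_tokens, d) embeddings from a text encoder (hypothetical)
    Returns:    (n_nodes, d) text-informed graph representations
    """
    d = graph_repr.shape[-1]
    scores = graph_repr @ text_repr.T / np.sqrt(d)  # (n_nodes, n_tokens)
    weights = softmax(scores, axis=-1)              # rows sum to 1
    return weights @ text_repr                      # (n_nodes, d)

rng = np.random.default_rng(0)
g = rng.standard_normal((4, 8))   # 4 structured-EHR node embeddings
t = rng.standard_normal((6, 8))   # 6 clinical-note token embeddings
fused = cross_modal_attention(g, t)
print(fused.shape)  # (4, 8)
```

In a full model this step would be stacked with feed-forward layers and run in both directions (text attending to graph as well), and trained end-to-end via the proxy tasks described in the abstract.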


