Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions

03/23/2022
by Chantal Pellegrini, et al.

Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP), and medical imaging. However, it has not been fully explored for clinical data analysis. Even though an immense amount of Electronic Health Record (EHR) data is recorded, data and labels can be scarce when the data is collected in small hospitals or concerns rare diseases. In such scenarios, pre-training on a larger set of EHR data could improve model performance. In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction. To model this data, we leverage graph deep learning over population graphs. We first design a network architecture based on a graph transformer that handles the various input feature types occurring in EHR data, such as continuous, discrete, and time-series features, allowing for better multi-modal data fusion. Further, we design pre-training methods based on masked imputation to pre-train our network before fine-tuning on different end tasks. Pre-training is done in a fully unsupervised fashion, which lays the groundwork for pre-training on large public datasets with different tasks and similar modalities in the future. We test our method on two medical datasets of patient records, TADPOLE and MIMIC-III, which include imaging and non-imaging features and different prediction tasks. We find that our proposed graph-based pre-training method helps in modeling the data at a population level and further improves performance on the fine-tuning tasks, in terms of AUC, on average by 4.15 for MIMIC-III and 7.64 for TADPOLE.
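To make the masked-imputation idea concrete, below is a minimal PyTorch sketch of the pre-training objective on a patient population graph. It is not the authors' implementation: the class and function names (PopulationGraphEncoder, pretrain_step), the single attention layer standing in for the full graph transformer, the dense adjacency matrix, the 15% mask ratio, and the use of a plain MSE reconstruction loss for all feature types are all illustrative assumptions.

```python
# Minimal sketch of masked-imputation pre-training on a patient population
# graph. Assumes dense, purely continuous node features and a fixed dense
# adjacency matrix with self-loops; the paper's model additionally handles
# discrete and time-series EHR features.
import torch
import torch.nn as nn

class PopulationGraphEncoder(nn.Module):
    """One graph-attention-style layer plus a feature-reconstruction head."""
    def __init__(self, num_features: int, hidden_dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(num_features, hidden_dim)
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_features)  # imputes masked inputs

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (num_patients, num_features) node features
        # adj: (num_patients, num_patients), 1 where two patients are connected
        h = self.proj(x).unsqueeze(0)          # (1, N, hidden)
        attn_mask = adj == 0                   # block attention between non-neighbors
        h, _ = self.attn(h, h, h, attn_mask=attn_mask)
        return self.head(h.squeeze(0))         # (N, num_features)

def pretrain_step(model, x, adj, optimizer, mask_ratio: float = 0.15) -> float:
    """Mask random feature entries and train the model to impute them."""
    mask = torch.rand_like(x) < mask_ratio
    x_masked = x.masked_fill(mask, 0.0)        # zero out the masked entries
    recon = model(x_masked, adj)
    loss = ((recon - x) ** 2)[mask].mean()     # loss only on masked entries
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the loss is computed only from the masked entries and their reconstructions, no task labels are involved, which is what makes the pre-training fully unsupervised; the same encoder can then be fine-tuned on a downstream patient-level prediction head.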


