Lung Cancer Risk Estimation with Incomplete Data: A Joint Missing Imputation Perspective

by Riqiang Gao, et al.
Vanderbilt University

Multi-modal data provide complementary information for clinical prediction, but missing data in clinical cohorts limit the number of subjects available for multi-modal learning. Existing methods struggle with multi-modal imputation when 1) the missing data span heterogeneous modalities (e.g., image vs. non-image), or 2) one modality is largely missing. In this paper, we address imputation of missing data by modeling the joint distribution of multi-modal data. Motivated by the partial bidirectional generative adversarial net (PBiGAN), we propose a new Conditional PBiGAN (C-PBiGAN) method that imputes one modality by combining conditional knowledge from another modality. Specifically, C-PBiGAN introduces a conditional latent space in a missing-data imputation framework that jointly encodes the available multi-modal data, along with a class regularization loss on imputed data to recover discriminative information. To our knowledge, it is the first generative adversarial model that addresses multi-modal missing-data imputation by modeling the joint distribution of image and non-image data. We validate our model on both the National Lung Screening Trial (NLST) dataset and an external clinical validation cohort. The proposed C-PBiGAN achieves significant improvements in lung cancer risk estimation compared with representative imputation methods (e.g., AUC values increase on both the NLST (+2.9%) and the in-house dataset (+4.3%) compared with PBiGAN, p<0.05).
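
The two ingredients named in the abstract, filling in missing entries from a generator and penalizing imputed data with a class-based loss, can be illustrated with a minimal NumPy sketch. This is not the C-PBiGAN implementation. The function names `impute_with_mask` and `class_regularization_loss`, the binary mask convention (1 = observed, 0 = missing), the choice of binary cross-entropy, and the toy arrays are all assumptions made for illustration, and the generator output is stood in for by a fixed array.

```python
import numpy as np

def impute_with_mask(observed, generated, mask):
    """Keep observed entries (mask == 1) and fill missing ones from `generated`.

    Hypothetical helper: the mask convention is an assumption, not taken
    from the paper.
    """
    return mask * observed + (1.0 - mask) * generated

def class_regularization_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted risk and true labels.

    Assumption: a cross-entropy term stands in for the paper's class
    regularization loss, whose exact form the abstract does not give.
    """
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(labels * np.log(probs)
                          + (1.0 - labels) * np.log(1.0 - probs)))

# Toy example: the middle feature is missing and stored as a placeholder 0.0.
observed = np.array([[1.0, 0.0, 3.0]])
generated = np.array([[9.0, 2.0, 9.0]])  # stand-in for a generator's output
mask = np.array([[1.0, 0.0, 1.0]])

imputed = impute_with_mask(observed, generated, mask)

# Loss for a single subject whose (hypothetical) classifier outputs 0.5
# on the imputed data while the true label is 1.
loss = class_regularization_loss(np.array([0.5]), np.array([1.0]))
```

In a trained model the `generated` array would come from a generator conditioned on the available modality, and the loss would be computed from a classifier applied to the imputed data; both are replaced by fixed values here to keep the sketch self-contained.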


Unified Multi-Modal Image Synthesis for Missing Modality Imputation

Multi-modal medical images provide complementary soft-tissue characteris...

Deep Multi-path Network Integrating Incomplete Biomarker and Chest CT Data for Evaluating Lung Cancer Risk

Clinical data elements (CDEs) (e.g., age, smoking history), blood marker...

Survival Analysis for Idiopathic Pulmonary Fibrosis using CT Images and Incomplete Clinical Data

Idiopathic Pulmonary Fibrosis (IPF) is an inexorably progressive fibroti...

Joint data imputation and mechanistic modelling for simulating heart-brain interactions in incomplete datasets

The use of mechanistic models in clinical studies is limited by the lack...

Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data

Understanding of the pathophysiology of obstructive lung disease (OLD) i...

VIGAN: Missing View Imputation with Generative Adversarial Networks

In an era when big data are becoming the norm, there is less concern wit...

Clustering-Induced Generative Incomplete Image-Text Clustering (CIGIT-C)

The target of image-text clustering (ITC) is to find correct clusters by...
