A Hierarchical Spike-and-Slab Model for Pan-Cancer Survival Using Pan-Omic Data

02/28/2021
by Sarah Samorodnitsky, et al.

Pan-omics, pan-cancer analysis has advanced our understanding of the molecular heterogeneity of cancer beyond what was known from single-cancer or single-omics studies. However, such analyses have been limited in their ability to use information from multiple sources of data (e.g., omics platforms) and multiple sample sets (e.g., cancer types) to predict important clinical outcomes, such as overall survival. We address prediction across multiple high-dimensional data sources and multiple sample sets by using exploratory results from BIDIFAC+, a method for integrative dimension reduction of bidimensionally linked matrices, in a predictive model. We apply a Bayesian hierarchical model that performs variable selection using spike-and-slab priors, which are modified to allow borrowing of information across clustered data. We use this method to predict overall patient survival in The Cancer Genome Atlas (TCGA) using data from 29 cancer types and 4 omics sources. Our model selected patterns of variation identified by BIDIFAC+ that differentiate clinical tumor subtypes with markedly different survival outcomes. We also use simulations to evaluate the variable selection accuracy and prediction accuracy of the modified spike-and-slab prior under different data-generating frameworks. Software and code used for our analysis can be found at https://github.com/sarahsamorodnitsky/HierarchicalSS_PanCanPanOmics/ .
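
To make the idea of a spike-and-slab prior that borrows information across clustered data more concrete, the following is a minimal sketch of one way such a prior could be simulated. It is not taken from the authors' software. The function name `sample_hierarchical_spike_slab`, all parameter names, and the specific distributional choices (a shared Bernoulli inclusion indicator per covariate, a normal slab centered at a covariate-level mean shared by all groups, and a narrow normal spike at zero) are assumptions made for illustration only.

```python
import random


def sample_hierarchical_spike_slab(n_groups, n_features, incl_prob=0.5,
                                   slab_mean_sd=1.0, slab_sd=0.5,
                                   spike_sd=0.01, seed=0):
    """Draw coefficients from an illustrative hierarchical spike-and-slab prior.

    Assumed structure (hypothetical, for illustration):
      gamma_j ~ Bernoulli(incl_prob)            shared across all groups
      if gamma_j == 1: mu_j ~ N(0, slab_mean_sd^2)
                       beta_ij ~ N(mu_j, slab_sd^2)   (slab, shared mean)
      if gamma_j == 0: beta_ij ~ N(0, spike_sd^2)     (spike near zero)
    """
    rng = random.Random(seed)
    gamma = []
    # beta[i][j] is the coefficient of covariate j in group (e.g., cancer type) i.
    beta = [[0.0] * n_features for _ in range(n_groups)]
    for j in range(n_features):
        # One inclusion indicator per covariate, shared by every group.
        g = 1 if rng.random() < incl_prob else 0
        gamma.append(g)
        if g == 1:
            # Groups borrow information through the shared slab mean mu_j.
            mu_j = rng.gauss(0.0, slab_mean_sd)
            for i in range(n_groups):
                beta[i][j] = rng.gauss(mu_j, slab_sd)
        else:
            # Excluded covariates get coefficients tightly concentrated at zero.
            for i in range(n_groups):
                beta[i][j] = rng.gauss(0.0, spike_sd)
    return gamma, beta


# Example: 29 groups (matching the number of cancer types) and 5 covariates.
gamma, beta = sample_hierarchical_spike_slab(n_groups=29, n_features=5)
```

In this sketch the indicator `gamma[j]` selects a covariate for all groups at once, while the group-specific coefficients vary around a common mean, which is one simple way to express sharing across clusters. The actual model, priors, and posterior computation used in the paper may differ.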

research
12/08/2017

Bayesian Variable Selection For Survival Data Using Inverse Moment Priors

Efficient variable selection in high dimensional cancer genomic studies ...
research
08/30/2018

Gaussian process regression for survival time prediction with genome-wide gene expression

Predicting the survival time of a cancer patient based on his/her genome...
research
04/10/2017

Multi-Kernel LS-SVM Based Bio-Clinical Data Integration: Applications to Ovarian Cancer

The medical research facilitates to acquire a diverse type of data from ...
research
12/29/2022

Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data

Rapid advancements in collection and dissemination of multi-platform mol...
research
09/08/2022

BatMan: Mitigating Batch Effects via Stratification for Survival Outcome Prediction

Reproducible translation of transcriptomics data has been hampered by th...
research
04/16/2020

Combining heterogeneous subgroups with graph-structured variable selection priors for Cox regression

Important objectives in cancer research are the prediction of a patient'...
research
02/07/2020

Bidimensional linked matrix factorization for pan-omics pan-cancer analysis

Several modern applications require the integration of multiple large da...
