The scalable Birth-Death MCMC Algorithm for Mixed Graphical Model Learning with Application to Genomic Data Integration

05/08/2020
by   Nanwei Wang, et al.

Recent advances in biological research have seen the emergence of high-throughput technologies with numerous applications that allow the study of biological mechanisms at an unprecedented depth and scale. A large amount of genomic data is now distributed through consortia like The Cancer Genome Atlas (TCGA), where specific types of biological information on specific types of tissue or cell are available. In cancer research, the challenge is now to perform integrative analyses of high-dimensional multi-omic data with the goal of better understanding genomic processes that correlate with cancer outcomes, e.g. elucidating gene networks that discriminate specific cancer subgroups (cancer subtyping) or discovering gene networks that overlap across different cancer types (pan-cancer studies). In this paper, we propose a novel mixed graphical model approach to analyze multi-omic data of different types (continuous, discrete and count) and perform model selection by extending the Birth-Death MCMC (BDMCMC) algorithm initially proposed by <cit.> and later developed by <cit.>. We compare the performance of our method to the LASSO method and the standard BDMCMC method using simulations and find that our method is superior in terms of both computational efficiency and the accuracy of the model selection results. Finally, an application to the TCGA breast cancer data shows that integrating genomic information at different levels (mutation and expression data) leads to better subtyping of breast cancers.

