Inference of Polygenic Factors Associated with Breast Cancer Gene Interaction Networks from Discrete Data Utilizing Poisson Multivariate Mutual Information

02/06/2020
by Jeremie Fish, et al.

In this work we introduce a new methodology for inferring, from gene expression data, the complex interactions associated with polygenic diseases, which remain a major frontier in understanding the factors that affect human health. In many cases disease may be related to the covariance of several genes rather than simply the variance of a single gene, making network inference crucial to the development of potential treatments. Specifically, we investigate the network of factors and associations involved in developing breast cancer from gene expression data. Our approach is information theoretic, but a major obstacle has been the discrete nature of such data, which is well described as a multivariate Poisson process. Although mutual information is generally a well-regarded approach for developing networks of association in the data science of complex systems across many disciplines, until now a good method for accurately and efficiently computing entropies from such processes has been lacking. Nonparametric methods such as the popular k-nearest neighbors (KNN) methods converge slowly and thus require unrealistic amounts of data. We use the causation entropy (CSE) principle, together with the associated greedy search algorithm optimal CSE (oCSE), as a network inference method to deduce the actual structure, with the multivariate Poisson estimator developed here as the core computational engine. We show that the Poisson version of oCSE outperforms both the Kraskov-Stögbauer-Grassberger (KSG) oCSE method (which is a KNN method for estimating the entropy) and the Gaussian oCSE method on synthetic data. We also present results for a breast cancer gene expression data set.
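
To make the idea of mutual information between Poisson-distributed counts concrete, the following is a minimal sketch and not the estimator developed in the paper. It assumes a textbook "common shock" construction of a bivariate Poisson pair, X1 = Y1 + Y12 and X2 = Y2 + Y12 with independent Poisson components, and evaluates the mutual information by direct summation over a truncated support. The function names, the rate parameters `lam1`, `lam2`, `lam12`, and the `cutoff` value are all illustrative assumptions.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam), evaluated in log space for stability."""
    if k < 0:
        return 0.0
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def bivariate_poisson_pmf(x1, x2, lam1, lam2, lam12):
    """Joint pmf of X1 = Y1 + Y12, X2 = Y2 + Y12 (assumed common-shock model).

    The shared component Y12 ~ Poisson(lam12) induces the dependence;
    we sum over every value k that the shared component could take.
    """
    return sum(
        poisson_pmf(x1 - k, lam1) * poisson_pmf(x2 - k, lam2) * poisson_pmf(k, lam12)
        for k in range(min(x1, x2) + 1)
    )


def mutual_information(lam1, lam2, lam12, cutoff=60):
    """I(X1; X2) in nats, by direct summation over counts 0..cutoff-1.

    The cutoff is an illustrative truncation; it should sit far in the
    tail of both marginals for the result to be accurate.
    """
    mi = 0.0
    for x1 in range(cutoff):
        p1 = poisson_pmf(x1, lam1 + lam12)  # marginal of X1 is Poisson(lam1 + lam12)
        for x2 in range(cutoff):
            p2 = poisson_pmf(x2, lam2 + lam12)  # marginal of X2 is Poisson(lam2 + lam12)
            p12 = bivariate_poisson_pmf(x1, x2, lam1, lam2, lam12)
            if p12 > 0.0 and p1 > 0.0 and p2 > 0.0:
                mi += p12 * math.log(p12 / (p1 * p2))
    return mi


if __name__ == "__main__":
    # Hypothetical rates: increasing the shared rate strengthens the dependence.
    for shared_rate in (0.0, 1.0, 3.0):
        print(shared_rate, mutual_information(2.0, 2.0, shared_rate))
```

With no shared component the two counts are independent and the mutual information vanishes; raising the shared rate increases it. A network inference procedure such as oCSE would need conditional versions of this quantity estimated from data, which this sketch does not attempt.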


research
04/17/2020

Identification of deregulated transcription factors involved in subtypes of cancers

We propose a methodology for the identification of transcription factors...
research
11/21/2008

Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks

We present a procedure for effective estimation of entropy and mutual in...
research
06/16/2018

CAMIRADA: Cancer microRNA association discovery algorithm, a case study on breast cancer

In recent studies, non-coding protein RNAs have been identified as micro...
research
05/04/2018

Mixture Envelope Model for Heterogeneous Genomics Data Analysis

Envelope model also known as multivariate regression model was proposed ...
research
03/17/2023

Breast Cancer Histopathology Image based Gene Expression Prediction using Spatial Transcriptomics data and Deep Learning

Tumour heterogeneity in breast cancer poses challenges in predicting out...
research
09/18/2020

Predicting molecular phenotypes from histopathology images: a transcriptome-wide expression-morphology analysis in breast cancer

Molecular phenotyping is central in cancer precision medicine, but remai...
research
06/17/2015

Detection of Epigenomic Network Community Oncomarkers

In this paper we propose network methodology to infer prognostic cancer ...
