Estimation of large block covariance matrices: Application to the analysis of gene expression data

06/26/2018
by Marie Perrot-Dockès, et al.

Motivated by an application in molecular biology, we propose a novel, efficient, and fully data-driven approach for estimating large, block-structured, sparse covariance matrices when the number of variables is much larger than the number of samples, without restricting ourselves to block-diagonal matrices. Our approach consists of approximating such a covariance matrix by the sum of a low-rank sparse matrix and a diagonal matrix. Our methodology can also handle matrices whose block structure appears only after the rows and columns are permuted according to an unknown permutation. Our technique is implemented in the R package BlockCov, which is available from the Comprehensive R Archive Network and from GitHub. To illustrate the statistical and numerical performance of our package, we provide numerical experiments and a thorough comparison with alternative methods. Finally, we apply our approach to gene expression data to better understand the toxicity of acetaminophen on the liver of rats.
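
The structure described in the abstract can be illustrated on a small synthetic matrix. The sketch below is not the BlockCov implementation and does not use its API. The loading matrix `Z`, its values, the unit-diagonal normalization, and the helper `low_rank_plus_diagonal` are all assumptions chosen for illustration. It only builds a matrix of the form "low-rank sparse plus diagonal" and shows how an unknown permutation of rows and columns hides the blocks. It does not estimate anything from data.

```python
import numpy as np


def low_rank_plus_diagonal(Z):
    """Return Sigma = Z Z^T + D, with D diagonal so that diag(Sigma) = 1.

    Hypothetical helper for illustration only (not part of BlockCov).
    """
    low_rank = Z @ Z.T
    D = np.diag(1.0 - np.diag(low_rank))
    return low_rank + D


# Sparse q x k loading matrix (assumed values): most entries are zero.
q, k = 8, 2
Z = np.zeros((q, k))
Z[0:3, 0] = 0.7    # variables 0-2 load on factor 0
Z[3:6, 0] = 0.3    # variables 3-5 also load weakly on factor 0
Z[3:6, 1] = -0.6   # variables 3-5 load on factor 1
# Variables 6-7 have no loadings, so they are uncorrelated with the rest.

Sigma = low_rank_plus_diagonal(Z)

# The low-rank part has rank at most k, and the blocks are not only diagonal:
# variables 0-2 and 3-5 share a nonzero off-diagonal block.
rank_low = np.linalg.matrix_rank(Z @ Z.T)

# An unknown permutation of rows and columns hides the block structure.
rng = np.random.default_rng(0)
perm = rng.permutation(q)
Sigma_perm = Sigma[np.ix_(perm, perm)]

# Knowing the permutation, its inverse restores the original ordering.
inv = np.argsort(perm)
Sigma_restored = Sigma_perm[np.ix_(inv, inv)]
```

In practice the permutation is unknown and has to be recovered from the data, which this sketch does not attempt.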


research
07/30/2019

Block-diagonal covariance estimation and application to the Shapley effects in sensitivity analysis

In this paper, we aim to estimate block-diagonal covariance matrices for...
research
03/27/2015

Estimating a common covariance matrix for network meta-analysis of gene expression datasets in diffuse large B-cell lymphoma

The estimation of covariance matrices of gene expressions has many appli...
research
12/04/2020

A Canonical Representation of Block Matrices with Applications to Covariance and Correlation Matrices

We obtain a canonical representation for block matrices. The representat...
research
01/02/2016

Joint Estimation of Precision Matrices in Heterogeneous Populations

We introduce a general framework for estimation of inverse covariance, o...
research
11/28/2018

Beyond Pham's algorithm for joint diagonalization

The approximate joint diagonalization of a set of matrices consists in f...
research
03/29/2020

The covariance shift (C-SHIFT) algorithm for normalizing biological data

Omics technologies are powerful tools for analyzing patterns in gene exp...
research
07/27/2013

Kronecker Sum Decompositions of Space-Time Data

In this paper we consider the use of the space vs. time Kronecker produc...
