Network Analysis of Count Data from Mixed Populations

12/07/2022
by Junjie Tang, et al.

In applications such as gene regulatory network analysis based on single-cell RNA sequencing data, samples often come from a mixture of different populations, and each population has its own network. Available graphical models often assume that all samples come from the same population and share the same network, so one has to first cluster the samples and then infer the network for each cluster separately. However, this two-step procedure ignores uncertainty in the clustering step and can therefore lead to inaccurate network estimation. Motivated by these applications, we consider the mixture Poisson log-normal model for network inference of count data from mixed populations. The latent precision matrices of the mixture model correspond to the networks of the different populations and can be jointly estimated by maximizing the lasso-penalized log-likelihood. Under rather mild conditions, we show that the mixture Poisson log-normal model is identifiable and has a positive definite Fisher information matrix. Consistency of the maximum lasso-penalized log-likelihood estimator is also established. To avoid the intractable optimization of the log-likelihood, we develop an algorithm called VMPLN based on the variational inference method. Comprehensive simulations and real single-cell RNA sequencing data analyses demonstrate the superior performance of VMPLN.
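
To make the data-generating process concrete, the following is a minimal simulation sketch of a mixture Poisson log-normal model, assuming the common formulation in which each sample first draws a population label, then a latent Gaussian vector with that population's mean and precision matrix, and finally Poisson counts whose rates are the exponentiated latent values. The function name `simulate_mixture_pln`, the parameter values, and the three-gene setup are hypothetical illustrations, not taken from the paper, and the sketch leaves out any library-size or scaling term the full model may include. It does not implement VMPLN or the lasso-penalized estimator.

```python
import numpy as np


def simulate_mixture_pln(n, weights, means, precisions, seed=0):
    """Draw n count vectors from a mixture Poisson log-normal model (sketch).

    weights    : mixing proportions, one per population (must sum to 1)
    means      : list of latent mean vectors, one per population
    precisions : list of positive definite latent precision matrices;
                 their nonzero off-diagonal entries encode each network
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    # Covariance of the latent Gaussian layer is the inverse precision.
    covariances = [np.linalg.inv(np.asarray(theta)) for theta in precisions]

    # Step 1: latent population label for every sample.
    labels = rng.choice(len(weights), size=n, p=weights)

    p = len(means[0])
    counts = np.empty((n, p), dtype=np.int64)
    for i, g in enumerate(labels):
        # Step 2: latent log-rates from the population-specific Gaussian.
        z = rng.multivariate_normal(means[g], covariances[g])
        # Step 3: observed counts are Poisson with rate exp(z).
        counts[i] = rng.poisson(np.exp(z))
    return counts, labels


# Two hypothetical populations over three genes with different networks:
# population 0 links genes 0 and 1, population 1 links genes 1 and 2.
theta_a = np.array([[1.0, 0.4, 0.0],
                    [0.4, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
theta_b = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, -0.4],
                    [0.0, -0.4, 1.0]])

counts, labels = simulate_mixture_pln(
    n=200,
    weights=[0.5, 0.5],
    means=[np.zeros(3), np.ones(3)],
    precisions=[theta_a, theta_b],
)
```

In this sketch the labels are returned only so the simulation can be checked; in the setting described above they are unobserved, which is why clustering and network estimation are handled jointly rather than in two separate steps.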

research
05/23/2022

Single-cell gene regulatory network analysis for mixed cell populations with applications to COVID-19 single cell data

Gene regulatory network (GRN) refers to the complex network formed by re...
research
11/07/2021

Gene regulatory network in single cells based on the Poisson log-normal model

Gene regulatory network inference is crucial for understanding the compl...
research
07/19/2013

The Cluster Graphical Lasso for improved estimation of Gaussian graphical models

We consider the task of estimating a Gaussian graphical model in the hig...
research
04/17/2020

A Mean Field Games model for finite mixtures of Bernoulli distributions

Finite mixture models are an important tool in the statistical analysis ...
research
05/07/2018

Learning Gene Regulatory Networks with High-Dimensional Heterogeneous Data

The Gaussian graphical model is a widely used tool for learning gene reg...
research
11/30/2017

A Multivariate Poisson-Log Normal Mixture Model for Clustering Transcriptome Sequencing Data

High-dimensional data of discrete and skewed nature is commonly encounte...
research
08/10/2020

Exact log-likelihood for clustering parameterised models and normally distributed data

Taking a model with equal means in each cluster, the log-likelihood for ...
