Bayesian Gamma-Negative Binomial Modeling of Single-Cell RNA Sequencing Data

08/01/2019
by   Siamak Zamani Dadaneh, et al.

Background: Single-cell RNA sequencing (scRNA-seq) is a powerful technique for profiling gene expression at single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light on the underlying cellular processes, helping to better understand development and disease mechanisms. The unique analytic challenge is to appropriately model highly over-dispersed scRNA-seq count data with prevalent dropouts (zero counts), which has made zero-inflated dimensionality reduction techniques popular for scRNA-seq data analysis. Employing zero-inflated distributions, however, may place extra emphasis on zero counts, leading to potential bias when identifying the latent structure of the data.

Results: In this paper, we propose a fully generative hierarchical gamma-negative binomial (hGNB) model of scRNA-seq data, obviating the need to model zero inflation explicitly. At the same time, hGNB can naturally account for covariate effects at both the gene and cell levels to identify complex latent representations of scRNA-seq data, without the need for commonly adopted pre-processing steps such as normalization. Efficient Bayesian model inference is derived by exploiting conditional conjugacy via novel data augmentation techniques.

Conclusion: Experimental results on both simulated data and several real-world scRNA-seq datasets suggest that hGNB is a powerful tool for cell cluster discovery as well as cell lineage inference.
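The claim that a negative binomial model can absorb many zero counts without a separate zero-inflation component can be illustrated with a small simulation. The sketch below is not the hGNB model itself, whose hierarchy is not specified in the abstract; it only draws negative binomial counts as a gamma-Poisson mixture and compares the fraction of zeros with that of a Poisson distribution of the same mean. The function name `sample_gamma_poisson` and the parameter values `r = 0.5` and `p = 0.9` are illustrative assumptions.

```python
import numpy as np

def sample_gamma_poisson(r, p, size, rng):
    """Draw negative binomial counts NB(r, p) as a gamma-Poisson mixture.

    Assumed parameterization (illustrative, not taken from the paper):
    rate ~ Gamma(shape=r, scale=p / (1 - p)), count ~ Poisson(rate),
    which gives mean r * p / (1 - p) and P(count = 0) = (1 - p) ** r.
    """
    rates = rng.gamma(shape=r, scale=p / (1.0 - p), size=size)
    return rng.poisson(rates)

rng = np.random.default_rng(0)
r, p = 0.5, 0.9  # hypothetical dispersion and probability parameters
counts = sample_gamma_poisson(r, p, size=200_000, rng=rng)

mean_count = r * p / (1.0 - p)           # theoretical NB mean
nb_zero_prob = (1.0 - p) ** r            # theoretical NB zero probability
poisson_zero_prob = np.exp(-mean_count)  # zero probability of a Poisson with the same mean
empirical_zero_frac = float(np.mean(counts == 0))

print(f"mean count:            {mean_count:.2f}")
print(f"NB zero probability:   {nb_zero_prob:.3f}")
print(f"empirical zero frac:   {empirical_zero_frac:.3f}")
print(f"Poisson zero prob:     {poisson_zero_prob:.3f}")
```

With a small shape parameter the mixture is strongly over-dispersed, so zeros are far more frequent than under a Poisson distribution with the same mean, even though no explicit zero-inflation term is present.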

research
04/04/2021

SimCD: Simultaneous Clustering and Differential expression analysis for single-cell transcriptomic data

Single-Cell RNA sequencing (scRNA-seq) measurements have facilitated gen...
research
11/01/2019

Kinetic foundation of the zero-inflated negative binomial model for single-cell RNA sequencing data

Single-cell RNA sequencing data have complex features such as dropout ev...
research
11/24/2020

Structure learning for zero-inflated counts, with an application to single-cell RNA sequencing data

The problem of estimating the structure of a graph from observed data is...
research
03/07/2018

Differential Expression Analysis of Dynamical Sequencing Count Data with a Gamma Markov Chain

Next-generation sequencing (NGS) to profile temporal changes in living s...
research
12/06/2020

Bayesian Modeling of Spatial Molecular Profiling Data via Gaussian Process

The location, timing, and abundance of gene expression (both mRNA and pr...
research
10/25/2021

RZiMM-scRNA: A regularized zero-inflated mixture model framework for single-cell RNA-seq data

Applications of single-cell RNA sequencing in various biomedical researc...
research
05/10/2022

Bayesian clustering of multiple zero-inflated outcomes

Several applications involving counts present a large proportion of zero...
