Reconstructing probabilistic trees of cellular differentiation from single-cell RNA-seq data

by Miriam Shiffman, et al.

Until recently, transcriptomics was limited to bulk RNA sequencing, obscuring the underlying expression patterns of individual cells in favor of a global average. Thanks to technological advances, we can now profile gene expression across thousands or millions of individual cells in parallel. This new type of data has led to the intriguing discovery that individual cell profiles can reflect the imprint of time or dynamic processes. However, synthesizing this information to reconstruct dynamic biological phenomena from data that are noisy, heterogeneous, and sparse, and from processes that may unfold asynchronously, poses a complex computational and statistical challenge. Here, we develop a full generative model for probabilistically reconstructing trees of cellular differentiation from single-cell RNA-seq data. Specifically, we extend the framework of the classical Dirichlet diffusion tree to simultaneously infer branch topology and latent cell states along continuous trajectories over the full tree. In tandem, we construct a novel Markov chain Monte Carlo sampler that interleaves Metropolis-Hastings and message passing to leverage model structure for efficient inference. Finally, we demonstrate that these techniques can recover latent trajectories from simulated single-cell transcriptomes. While this work is motivated by cellular differentiation, we derive a tractable model that provides flexible densities for any data (coupled with an appropriate noise model) that arise from continuous evolution along a latent nonparametric tree.
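For readers unfamiliar with the classical Dirichlet diffusion tree mentioned above, the following is a minimal sketch of drawing from that prior in one dimension. It is not the extended model or the sampler described in the abstract. It assumes the commonly used divergence function a(t) = c / (1 - t) and Brownian diffusion with scale `sigma`; the names `Node`, `sample_ddt`, `c`, and `sigma` are illustrative choices, not taken from the paper.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality, so list.index finds the exact node
class Node:
    time: float
    loc: float
    count: int = 1  # number of points whose path ends at or passes through this node
    children: list = field(default_factory=list)


def divergence_time(t_start, m, c, rng):
    """Sample a divergence time on a branch already traversed by m points.

    Assumes a(t) = c / (1 - t), so A(t) = -c * log(1 - t) and the chance of
    not diverging by time t is exp(-(A(t) - A(t_start)) / m).
    """
    a_target = -c * math.log(1.0 - t_start) + m * rng.expovariate(1.0)
    return max(1.0 - math.exp(-a_target / c), t_start)


def add_point(root, c, sigma, rng):
    """Route one new point down the tree until it diverges; return its leaf."""
    node = root
    while True:
        # At a branch point, follow a child in proportion to past traffic.
        weights = [ch.count for ch in node.children]
        child = rng.choices(node.children, weights=weights)[0]
        m = child.count
        t_d = divergence_time(node.time, m, c, rng)
        # Divergence before t = 1 is certain; resample if rounding says otherwise.
        while not child.children and t_d >= child.time:
            t_d = divergence_time(node.time, m, c, rng)
        if t_d < child.time:
            # Location at the divergence time via a Brownian bridge on the segment.
            span = child.time - node.time
            frac = (t_d - node.time) / span
            mean = node.loc + frac * (child.loc - node.loc)
            var = sigma ** 2 * (t_d - node.time) * (child.time - t_d) / span
            loc = rng.gauss(mean, math.sqrt(max(var, 0.0)))
            branch = Node(t_d, loc, count=m + 1)
            # The new point then diffuses independently until time 1.
            leaf = Node(1.0, rng.gauss(loc, sigma * math.sqrt(1.0 - t_d)))
            branch.children = [child, leaf]
            node.children[node.children.index(child)] = branch
            return leaf
        child.count += 1
        node = child


def sample_ddt(n, c=1.0, sigma=1.0, seed=None):
    """Draw n leaf locations (n >= 1) sequentially from the tree prior."""
    rng = random.Random(seed)
    root = Node(0.0, 0.0)
    first = Node(1.0, rng.gauss(0.0, sigma))
    root.children.append(first)
    leaves = [first]
    for _ in range(n - 1):
        leaves.append(add_point(root, c, sigma, rng))
    return root, leaves


def iter_edges(root):
    """Yield (parent, child) pairs for every edge in the tree."""
    stack = [root]
    while stack:
        node = stack.pop()
        for ch in node.children:
            yield node, ch
            stack.append(ch)
```

Calling `sample_ddt(50, c=1.0, sigma=1.0, seed=0)` returns the root and a list of 50 leaf nodes whose `loc` values play the role of latent states at the tips. Smaller `c` pushes divergences later in time, which gives more tightly clustered leaves.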




