Learning Causal Representations of Single Cells via Sparse Mechanism Shift Modeling

11/07/2022
by Romain Lopez, et al.

Latent variable models such as the Variational Auto-Encoder (VAE) have become a go-to tool for analyzing biological data, especially in the field of single-cell genomics. One remaining challenge is the interpretability of latent variables as biological processes that define a cell's identity. Outside of biological applications, this problem is commonly referred to as learning disentangled representations. Although several disentanglement-promoting variants of the VAE have been introduced and applied to single-cell genomics data, this task has been shown to be infeasible from independent and identically distributed measurements without additional structure. Instead, recent methods propose to leverage non-stationary data, together with the sparse mechanism shift assumption, to learn disentangled representations with causal semantics. Here, we extend the application of these methodological advances to the analysis of single-cell genomics data with genetic or chemical perturbations. More precisely, we propose a deep generative model of single-cell gene expression data in which each perturbation is treated as a stochastic intervention targeting an unknown, but sparse, subset of latent variables. We benchmark these methods on simulated single-cell data to evaluate their performance at latent unit recovery, causal target identification, and out-of-domain generalization. Finally, we apply these approaches to two real-world large-scale gene perturbation data sets and find that models that exploit the sparse mechanism shift hypothesis surpass contemporary methods on a transfer learning task. We implement our new model and benchmarks using the scvi-tools library, and release them as open-source software at https://github.com/Genentech/sVAE.
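To make the sparse mechanism shift idea concrete, here is a minimal toy simulation in NumPy. It is an illustrative sketch only, not the authors' sVAE implementation and not part of the scvi-tools API. The function name `simulate_sparse_shift`, the linear-softmax decoder, the Poisson count likelihood, the mean-shift form of the intervention, and all parameter values are assumptions made for this example.

```python
import numpy as np


def simulate_sparse_shift(n_cells=200, n_latent=5, n_genes=20,
                          n_perturbations=3, targets_per_perturbation=1,
                          library_size=1000, seed=0):
    """Toy generative process (hypothetical, for illustration only).

    Each perturbation shifts the distribution of a small, hidden subset of
    latent variables; all other latent variables keep their control prior.
    """
    rng = np.random.default_rng(seed)

    # Binary target mask: row k marks the latent units hit by perturbation k.
    # Row 0 is the unperturbed control condition and has no targets.
    mask = np.zeros((n_perturbations + 1, n_latent), dtype=bool)
    for k in range(1, n_perturbations + 1):
        idx = rng.choice(n_latent, size=targets_per_perturbation, replace=False)
        mask[k, idx] = True

    # Mean shifts are nonzero only on targeted units (assumed intervention form).
    shifts = rng.normal(0.0, 3.0, size=mask.shape) * mask

    # Assign each cell to a condition, then sample its latent variables.
    labels = rng.integers(0, n_perturbations + 1, size=n_cells)
    z = rng.normal(size=(n_cells, n_latent)) + shifts[labels]

    # Assumed decoder: linear map + softmax gives per-gene expression proportions.
    weights = rng.normal(size=(n_latent, n_genes))
    logits = z @ weights
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    rho = np.exp(logits)
    rho /= rho.sum(axis=1, keepdims=True)

    # Assumed observation model: Poisson counts at a fixed library size.
    counts = rng.poisson(library_size * rho)
    return counts, z, labels, mask, shifts


counts, z, labels, mask, shifts = simulate_sparse_shift()
```

In this sketch, `mask` plays the role of the unknown sparse intervention targets. A model built on the sparse mechanism shift assumption would try to recover both the latent variables and such a mask from `counts` and `labels` alone.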


