Modelling Technical and Biological Effects in scRNA-seq data with Scalable GPLVMs

09/14/2022
by Vidhi Lalchand, et al.
Single-cell RNA-seq datasets are growing in size and complexity, enabling the study of changes in cellular composition across biological and clinical contexts. Scalable dimensionality reduction techniques are needed to disentangle the biological variation in these datasets while accounting for technical and biological confounders. In this work, we extend a popular approach for probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, to scale to massive single-cell datasets while explicitly accounting for technical and biological confounders. The key idea is to use an augmented kernel that preserves the factorisability of the lower bound, allowing for fast stochastic variational inference. We demonstrate its ability to reconstruct the latent signatures of innate immunity recovered in Kumasaka et al. (2021) with 9x lower training time. We further analyze a COVID dataset and demonstrate, across a cohort of 130 individuals, that this framework enables data integration while capturing interpretable signatures of infection. Specifically, we explore COVID severity as a latent dimension to refine patient stratification and capture disease-specific gene expression.
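To make the key idea more concrete, here is a minimal NumPy sketch of what an augmented kernel could look like. It is an illustration under stated assumptions, not the authors' implementation: the additive form (an RBF kernel on latent coordinates plus a linear kernel on known covariates), the one-hot batch covariate, the inducing points, and all function and variable names are hypothetical. The second half shows why factorisability matters: when the bound contains a sum of per-cell terms, rescaled minibatch sums give an unbiased estimate of the full sum.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel on latent coordinates.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def linear_kernel(a, b, variance=1.0):
    # Linear kernel on known covariates (e.g. one-hot batch labels).
    return variance * a @ b.T

def augmented_kernel(z1, c1, z2, c2):
    # ASSUMPTION: an additive combination; the paper's kernel may differ.
    return rbf_kernel(z1, z2) + linear_kernel(c1, c2)

rng = np.random.default_rng(0)
n_cells, latent_dim, n_batches, n_inducing = 200, 2, 3, 10

Z = rng.normal(size=(n_cells, latent_dim))                        # latent positions
C = np.eye(n_batches)[rng.integers(0, n_batches, size=n_cells)]   # hypothetical batch covariate
Zu = rng.normal(size=(n_inducing, latent_dim))                    # inducing latent positions
Cu = np.eye(n_batches)[rng.integers(0, n_batches, size=n_inducing)]

K = augmented_kernel(Z, C, Z, C)        # (n_cells, n_cells) Gram matrix
K_nm = augmented_kernel(Z, C, Zu, Cu)   # (n_cells, n_inducing) cross-covariance

# Stand-in for each cell's contribution to a factorised bound: it depends
# only on that cell's own row of K_nm, so the bound is a sum over cells.
per_cell_term = np.sum(K_nm**2, axis=1)
full_sum = per_cell_term.sum()

# Minibatch estimates: rescale each batch sum by n_cells / batch_size.
batch_size = 20
perm = rng.permutation(n_cells)
estimates = np.array([
    (n_cells / batch_size) * per_cell_term[perm[i:i + batch_size]].sum()
    for i in range(0, n_cells, batch_size)
])

# Averaging over one full pass of disjoint minibatches recovers the full sum.
print(np.isclose(estimates.mean(), full_sum))
```

Because each minibatch estimate only touches `batch_size` rows, the cost of a gradient step in such a scheme does not grow with the total number of cells, which is the property that stochastic variational inference relies on.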

research
02/25/2022

Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference

Gaussian process latent variable models (GPLVM) are a flexible and non-l...
research
10/16/2018

Covariate Gaussian Process Latent Variable Models

Gaussian Process Regression (GPR) and Gaussian Process Latent Variable M...
research
03/06/2020

BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders

Variational Autoencoders (VAEs) provide a flexible and scalable framewor...
research
10/06/2020

Gene Regulatory Network Inference with Latent Force Models

Delays in protein synthesis cause a confounding effect when constructing...
research
09/07/2017

A deep generative model for gene expression profiles from single-cell RNA sequencing

We propose a probabilistic model for interpreting gene expression levels...
research
07/02/2023

Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity

This paper presents a novel approach that leverages domain variability t...
research
05/06/2023

A Nonparametric Mixed-Effects Mixture Model for Patterns of Clinical Measurements Associated with COVID-19

Some patients with COVID-19 show changes in signs and symptoms such as t...
