MCA: Multiresolution Correlation Analysis, a graphical tool for subpopulation identification in single-cell gene expression data

07/08/2014
by Justin Feigelman, et al.

Background: Biological data often originate from samples containing mixtures of subpopulations, corresponding, for example, to distinct cellular phenotypes. However, identifying distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable. Results: We present Multiresolution Correlation Analysis (MCA), a method for visually identifying subpopulations based on the local pairwise correlation between covariates, without needing to define an a priori interaction scale. We demonstrate that MCA facilitates the identification of differentially regulated subpopulations in simulated data from a small gene regulatory network, and then apply it to previously published single-cell qPCR data from mouse embryonic stem cells. We show that MCA recovers previously identified subpopulations, provides additional insight into the underlying correlation structure, reveals potentially spurious compartmentalizations, and provides insight into novel subpopulations. Conclusions: MCA is a useful method for identifying subpopulations in low-dimensional expression data, such as those arising from qPCR or FACS measurements. With MCA it is possible to investigate the robustness of covariate correlations with respect to subpopulations, graphically identify outliers, and identify factors contributing to differential regulation between pairs of covariates. MCA thus provides a framework for investigating expression correlations for genes of interest and for generating biological hypotheses.
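
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what "local pairwise correlation without an a priori scale" could look like in practice. It assumes (this is an assumption, not a statement of the authors' implementation) that cells are ordered by a third covariate `z` and that the Pearson correlation of two covariates `x` and `y` is computed in sliding windows of several widths. The function name `multiresolution_correlation` and all parameter names are hypothetical.

```python
import numpy as np


def multiresolution_correlation(x, y, z, widths):
    """Pearson correlation of x and y in sliding windows over cells sorted by z.

    Illustrative assumption only: ordering by a third covariate and scanning
    several window widths is one way to avoid fixing a single scale up front.
    Returns a dict mapping each width w to an array with one correlation per
    window start position (len(x) - w + 1 entries).
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    order = np.argsort(z, kind="stable")  # order cells by the sorting covariate
    xs, ys = x[order], y[order]
    n = len(xs)
    result = {}
    for w in widths:
        if w < 3 or w > n:
            raise ValueError(f"window width {w} must be between 3 and {n}")
        corr = np.full(n - w + 1, np.nan)
        for start in range(n - w + 1):
            xa = xs[start:start + w]
            ya = ys[start:start + w]
            # Correlation is undefined when either covariate is constant
            # inside the window; such windows stay NaN.
            if xa.std() > 0 and ya.std() > 0:
                corr[start] = np.corrcoef(xa, ya)[0, 1]
        result[w] = corr
    return result


# Synthetic example: two hypothetical subpopulations separated along z,
# with opposite regulation between x and y.
rng = np.random.default_rng(0)
n = 40
z = np.arange(n)
x = rng.normal(size=n)
y = np.where(z < n // 2, x, -x)  # y follows x for low z, opposes it for high z
mca = multiresolution_correlation(x, y, z, widths=[10, 40])
```

In this toy data the narrow windows at either end of the ordering show strong correlations of opposite sign, while the single full-width window mixes both groups. Plotting the per-width arrays as rows of a heatmap would be one way to inspect all scales at once.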
