Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses

01/17/2020
by Shai Gorsky, et al.

Flow cytometry (FCM) is the standard multi-parameter assay used to measure single-cell phenotype and functionality. It is commonly used to quantify the relative frequencies of cell subsets in blood and disaggregated tissues. A typical analysis of FCM data involves cell classification (the identification of cell subgroups in the sample) and comparisons of the cell subgroups across samples. While modern experiments often necessitate the collection and processing of samples in multiple batches, analysis of FCM data across batches is challenging because the locations of cell subsets in the marker space may vary across samples. Differences across samples may arise from true biological variation or from technical causes such as antibody lot effects or instrument optics. An important step in comparative analyses of multi-sample FCM data is cross-sample calibration, whose goal is to align cell subsets across multiple samples despite variation in their locations, so that technical variation is minimized and true biological variation can be meaningfully compared. We introduce a Bayesian nonparametric hierarchical modeling approach that accomplishes calibration and cell classification simultaneously in a unified probabilistic manner. Three features of our method make it particularly effective for analyzing multi-sample FCM data: a nonparametric mixture avoids prespecifying the number of cell clusters; the hierarchical skew normal kernels allow flexibility in the shapes of the cell subsets and cross-sample variation in their locations; and the "coarsening" strategy makes inference robust to small departures from the model, a feature that becomes crucial with massive numbers of observations such as those encountered in FCM data. We demonstrate the merits of our approach in simulated examples and carry out a case study in the analysis of two FCM data sets.
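To make two of the ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It evaluates a univariate skew normal kernel, lets each sample shift the shared cluster locations by its own offsets (a simple stand-in for the hierarchical structure), and tempers the log-likelihood by a fractional power. The power form `zeta = alpha / (alpha + n)` is an assumption borrowed from the common power-likelihood approximation to coarsening; the abstract does not state which form the paper uses. All function and parameter names here are hypothetical.

```python
import math


def skew_normal_pdf(x, loc, scale, shape):
    """Univariate skew normal density: (2 / scale) * phi(z) * Phi(shape * z)."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 * phi * big_phi / scale


def sample_log_likelihood(data, weights, shared_locs, offsets, scales, shapes):
    """Log-likelihood of one sample under a finite skew normal mixture.

    Assumption: each sample's cluster location is the shared location plus a
    sample-specific offset, mimicking cross-sample variation in locations.
    """
    total = 0.0
    for x in data:
        density = sum(
            w * skew_normal_pdf(x, m + d, s, a)
            for w, m, d, s, a in zip(weights, shared_locs, offsets, scales, shapes)
        )
        total += math.log(density)
    return total


def coarsened_log_likelihood(log_lik, n, alpha):
    """Temper a log-likelihood by zeta = alpha / (alpha + n).

    Assumption: this power form is one common approximation to coarsening;
    it is not confirmed by the abstract.
    """
    zeta = alpha / (alpha + n)
    return zeta * log_lik


# Hypothetical two-cluster example for a single sample.
data = [-1.2, -0.8, 0.1, 2.9, 3.4]
log_lik = sample_log_likelihood(
    data,
    weights=[0.6, 0.4],
    shared_locs=[-1.0, 3.0],
    offsets=[0.1, -0.2],  # sample-specific shifts of the shared locations
    scales=[1.0, 0.8],
    shapes=[2.0, -1.0],
)
tempered = coarsened_log_likelihood(log_lik, n=len(data), alpha=len(data))
```

With `alpha` fixed and `n` growing, `zeta` shrinks toward zero, so each additional observation contributes less. This is one way to see why tempering can keep small model misfit from dominating inference when the number of cells is very large.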


research
04/17/2017

Mixture modeling on related samples by ψ-stick breaking and kernel perturbation

There has been great interest recently in applying nonparametric kernel ...
research
02/20/2020

A Bayesian Feature Allocation Model for Identification of Cell Subpopulations Using Cytometry Data

A Bayesian feature allocation model (FAM) is presented for identifying c...
research
12/07/2018

METCC: METric learning for Confounder Control Making distance matter in high dimensional biological analysis

High-dimensional data acquired from biological experiments such as next ...
research
02/14/2017

Sequential Dirichlet Process Mixtures of Multivariate Skew t-distributions for Model-based Clustering of Flow Cytometry Data

Flow cytometry is a high-throughput technology used to quantify multiple...
research
05/31/2013

Joint Modeling and Registration of Cell Populations in Cohorts of High-Dimensional Flow Cytometric Data

In systems biomedicine, an experimenter encounters different potential s...
research
11/11/2014

Supervised Classification of Flow Cytometric Samples via the Joint Clustering and Matching (JCM) Procedure

We consider the use of the Joint Clustering and Matching (JCM) procedure...
research
11/27/2022

An Empirical Bayes Approach for Constructing the Confidence Intervals of Clonality and Entropy

This paper is motivated by the need to quantify human immune responses t...
