optimalFlow: Optimal-transport approach to flow cytometry gating and population matching

07/18/2019
by Eustasio del Barrio, et al.

Flow cytometry data show pronounced variability for both biological and technical reasons. Biological variability is a well-known phenomenon that arises because measurements are taken on different individuals, who differ in characteristics such as age and sex. Technical sources of variability include different measurement settings, changing conditions during experiments, and different types of flow cytometers. This high variability makes it difficult to use supervised machine learning to identify cell populations. We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces a prototype cytometry for each group. We show that supervised learning restricted to the new groups performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state-of-the-art techniques on the proposed datasets. Our code and data are freely available as R packages at https://github.com/HristoInouzhe/optimalFlow and https://github.com/HristoInouzhe/optimalFlowData.
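To make the idea of grouping cytometries and summarizing each group by a Wasserstein barycenter concrete, here is a minimal one-dimensional sketch in Python. It is not the optimalFlow R package or its API: the function names (`w2_distance`, `w2_barycenter`, `group_by_threshold`), the toy data, and the threshold are all hypothetical, and the simple threshold grouping stands in for whatever similarity distance and clustering the authors actually use. It relies only on the standard fact that, in one dimension with equal sample sizes, the 2-Wasserstein distance compares sorted samples and the equal-weight barycenter averages them.

```python
from math import sqrt


def w2_distance(a, b):
    """2-Wasserstein distance between two equal-size 1D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    xs, ys = sorted(a), sorted(b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def w2_barycenter(samples):
    """Equal-weight 1D 2-Wasserstein barycenter: average the order statistics."""
    sorted_samples = [sorted(s) for s in samples]
    return [sum(col) / len(col) for col in zip(*sorted_samples)]


def group_by_threshold(samples, threshold):
    """Single-linkage grouping: samples closer than `threshold` share a group."""
    labels = list(range(len(samples)))
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            if w2_distance(samples[i], samples[j]) < threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    return list(groups.values())


# Hypothetical toy "cytometries": one marker, three cells each.
cytometries = [
    [0.0, 1.0, 2.0],
    [0.1, 1.1, 2.1],
    [5.0, 6.0, 7.0],
    [5.2, 6.2, 7.2],
]

groups = group_by_threshold(cytometries, threshold=1.0)
# One prototype ("template") per group, as the barycenter of its members.
templates = [w2_barycenter([cytometries[i] for i in g]) for g in groups]
```

In this toy setting, a new sample could be assigned to the group whose template is nearest under `w2_distance`, which mirrors the role the abstract gives to templates in classification; real cytometries are multivariate, so the sorting shortcut used here would not apply.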


