Geodesic Sinkhorn: optimal transport for high-dimensional datasets

11/02/2022
by   Guillaume Huguet, et al.

Understanding the dynamics and reactions of cells from population snapshots is a major challenge in single-cell transcriptomics. Here, we present Geodesic Sinkhorn, a method for interpolating populations along a data manifold that leverages existing kernels developed for single-cell dimensionality reduction and visualization methods. Geodesic Sinkhorn uses a heat-geodesic ground distance that, compared to Euclidean ground distances, interpolates single-cell dynamics more accurately on a wide variety of datasets and significantly speeds up computation for sparse kernels. First, we apply Geodesic Sinkhorn to 10 single-cell transcriptomics time series interpolation datasets as a drop-in replacement for existing interpolation methods, where it outperforms them on all datasets, showing its effectiveness in modeling cell dynamics. Second, we show how to efficiently approximate the operator with polynomial kernels, which improves scaling to large datasets. Finally, we define the conditional Wasserstein-average treatment effect and show how it can elucidate the treatment effect on single-cell populations in a drug screen.
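The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of running Sinkhorn iterations with a manifold-aware kernel. It assumes, without confirmation from the text above, that the kernel is the heat kernel exp(-tL) of a k-nearest-neighbor graph Laplacian, used in place of the usual Gibbs kernel built from Euclidean distances. The function names `heat_kernel` and `sinkhorn_scalings`, the toy circle data, and the parameters `n_neighbors`, `t`, and `n_iter` are all hypothetical choices for illustration, and the dense eigendecomposition is only practical for small graphs.

```python
import numpy as np


def heat_kernel(points, n_neighbors=4, t=5.0):
    """Dense heat kernel exp(-t L) of a symmetrized k-NN graph (assumed construction)."""
    n = len(points)
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    # Column 0 of the argsort is the point itself (distance 0), so skip it.
    idx = np.argsort(sq_dists, axis=1)[:, 1:n_neighbors + 1]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)              # make the graph undirected
    lap = np.diag(adj.sum(axis=1)) - adj      # combinatorial graph Laplacian
    evals, evecs = np.linalg.eigh(lap)
    kernel = (evecs * np.exp(-t * evals)) @ evecs.T
    return np.maximum(kernel, 0.0)            # clip tiny negative round-off


def sinkhorn_scalings(kernel, mu, nu, n_iter=1000):
    """Standard Sinkhorn scaling iterations for a given positive kernel."""
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)
    return u, v


# Toy data (hypothetical): 40 evenly spaced points on a circle.
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)
kernel = heat_kernel(points)

# Two populations supported on different arcs of the circle.
mu = np.zeros(40)
mu[0:10] = 0.1
nu = np.zeros(40)
nu[20:30] = 0.1

u, v = sinkhorn_scalings(kernel, mu, nu)
plan = u[:, None] * kernel * v[None, :]       # transport plan diag(u) K diag(v)
```

The resulting `plan` couples the two populations using only kernel-vector products, which is why a sparse or cheaply approximated kernel can reduce the cost of each iteration.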

research
05/30/2023

A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction

Diffusion-based manifold learning methods have proven useful in represen...
research
10/15/2021

SGEN: Single-cell Sequencing Graph Self-supervised Embedding Network

Single-cell sequencing has a significant role to explore biological proc...
research
02/11/2021

Unsupervised Ground Metric Learning using Wasserstein Eigenvectors

Optimal Transport (OT) defines geometrically meaningful "Wasserstein" di...
research
11/19/2022

BENK: The Beran Estimator with Neural Kernels for Estimating the Heterogeneous Treatment Effect

A method for estimating the conditional average treatment effect under c...
research
09/30/2022

Neural Unbalanced Optimal Transport via Cycle-Consistent Semi-Couplings

Comparing unpaired samples of a distribution or population taken at diff...
research
02/04/2011

Collective Classification of Textual Documents by Guided Self-Organization in T-Cell Cross-Regulation Dynamics

We present and study an agent-based model of T-Cell cross-regulation in ...
research
10/30/2015

Principal Differences Analysis: Interpretable Characterization of Differences between Distributions

We introduce principal differences analysis (PDA) for analyzing differen...
