Orthogonalization of data via Gromov-Wasserstein type feedback for clustering and visualization

07/25/2022
by Martin Ryner, et al.

In this paper we propose an adaptive approach for clustering and visualization of data by an orthogonalization process. Starting with the data points represented by a Markov process using the diffusion map framework, the method adaptively increases the orthogonality of the clusters by applying a feedback mechanism inspired by the Gromov-Wasserstein distance. This mechanism iteratively increases the spectral gap and refines the orthogonality of the data to achieve a clustering with high specificity. By using the diffusion map framework and representing the relation between data points using transition probabilities, the method is robust with respect to the underlying distance, noise in the data, and random initialization. We prove that the method converges globally to a unique fixed point for certain parameter values. We also propose a related approach where the transition probabilities in the Markov process are required to be doubly stochastic, in which case the method generates a minimizer of a nonconvex optimization problem. We apply the method to cryo-electron microscopy image data from biopharmaceutical manufacturing, where we can confirm biologically relevant insights related to therapeutic efficacy. We consider an example with morphological variations of gene packaging and confirm that the method produces biologically meaningful clustering results consistent with human expert classification.
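
As a rough illustration of the starting point described above (data points represented by Markov transition probabilities, with an optional doubly stochastic variant), the following minimal sketch builds both matrices for a toy data set. It does not implement the paper's Gromov-Wasserstein type feedback mechanism, which the abstract does not specify. The Gaussian kernel, the bandwidth `eps`, the toy points, and the use of Sinkhorn-style row/column rescaling to approach a doubly stochastic matrix are all assumptions made for illustration only.

```python
import math

def gaussian_kernel(points, eps):
    """Pairwise affinities k(x, y) = exp(-||x - y||^2 / eps) (assumed kernel)."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / eps)
             for q in points] for p in points]

def row_normalize(K):
    """Turn affinities into a row-stochastic Markov transition matrix."""
    P = []
    for row in K:
        s = sum(row)
        P.append([v / s for v in row])
    return P

def sinkhorn(K, iters=200):
    """Alternately rescale rows and columns (Sinkhorn-style, an assumed choice)
    so the result approaches a doubly stochastic matrix."""
    n = len(K)
    P = [row[:] for row in K]
    for _ in range(iters):
        P = row_normalize(P)                      # make rows sum to 1
        col = [sum(P[i][j] for i in range(n)) for j in range(n)]
        P = [[P[i][j] / col[j] for j in range(n)] for i in range(n)]  # columns
    return P

# Hypothetical toy data: two well-separated clusters on the real line.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
K = gaussian_kernel(points, eps=1.0)
P = row_normalize(K)   # transition probabilities between data points
D = sinkhorn(K)        # approximately doubly stochastic variant

print("within-cluster transition  P[0][1] =", P[0][1])
print("between-cluster transition P[0][3] =", P[0][3])
```

In this toy setting the transition probability between points in the same cluster is far larger than between clusters, which is the kind of near block structure that a clustering method built on transition probabilities can exploit.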

research
11/18/2019

Basic Principles of Clustering Methods

Clustering methods group a set of data points into a few coherent groups...
research
03/28/2022

Time-inhomogeneous diffusion geometry and topology

Diffusion condensation is a dynamic process that yields a sequence of mu...
research
05/25/2019

A New Clustering Method Based on Morphological Operations

With the booming development of data science, many clustering methods ha...
research
07/15/2015

Unsupervised Decision Forest for Data Clustering and Density Estimation

An algorithm to improve performance parameter for unsupervised decision ...
research
05/22/2017

Online Factorization and Partition of Complex Networks From Random Walks

Finding the reduced-dimensional structure is critical to understanding c...
research
12/16/2020

Predictive K-means with local models

Supervised classification can be effective for prediction but sometimes ...
research
09/15/2015

The Shape of Data and Probability Measures

We introduce the notion of multiscale covariance tensor fields (CTF) ass...
