Kernel Biclustering algorithm in Hilbert Spaces

08/07/2022
by Marcos Matabuena et al.

Biclustering algorithms partition the data and the covariates simultaneously, providing new insights in several domains, such as the analysis of gene expression to discover new biological functions. This paper develops a new model-free biclustering algorithm in abstract spaces using the notions of energy distance (ED) and maximum mean discrepancy (MMD), two distances between probability distributions capable of handling complex data such as curves or graphs. The proposed method can learn more general and complex cluster shapes than most approaches in the existing literature, which usually focus on detecting differences in means and variances. Although the biclustering configurations of our approach are constrained to form disjoint structures at the datum and covariate levels, the results are competitive: they match state-of-the-art methods in their optimal scenarios, assuming an appropriate kernel choice, and outperform them when cluster differences are concentrated in higher-order moments. The model's performance has been tested in several situations involving simulated and real-world datasets. Finally, new theoretical consistency results are established using tools from the theory of optimal transport.
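To make the MMD notion mentioned above concrete, here is a minimal sketch of the standard unbiased squared-MMD estimator between two samples, using a Gaussian kernel. This is an illustrative implementation of the generic MMD statistic, not the paper's biclustering algorithm; the function names and the bandwidth parameter `sigma` are choices made for this example.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    # Pairwise squared Euclidean distances via the expansion ||x - y||^2
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of the squared MMD between samples X and Y."""
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, sigma)
    Kyy = gaussian_kernel(Y, Y, sigma)
    Kxy = gaussian_kernel(X, Y, sigma)
    # The unbiased estimator excludes the diagonal (self-similarity) terms
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * Kxy.mean()
```

Two samples drawn from the same distribution yield an estimate near zero, while samples with different distributions (e.g. shifted means, or equal means but different higher-order moments, the regime the abstract highlights) yield a clearly positive value, provided the kernel bandwidth is chosen appropriately.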

