Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

10/10/2016
by   Thibault Lesieur, et al.

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of m points in n dimensions, n, m → ∞ and α = m/n stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of α and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, r > 4 + 2√α, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed.
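
To make the setup concrete, the following is a minimal sketch in Python, not the authors' code. It computes the ratio α = m/n, checks the condition r > 4 + 2√α quoted in the abstract, and draws a toy dataset of m points in n dimensions from r Gaussian clusters. The function names, the `separation` parameter, the unit noise variance, and the 1/√n scaling of the cluster centers are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def aspect_ratio(m, n):
    """Return alpha = m / n, the ratio of sample count to dimension."""
    return m / n


def has_gap(r, alpha):
    """Check the condition r > 4 + 2*sqrt(alpha) quoted in the abstract."""
    return r > 4 + 2 * math.sqrt(alpha)


def sample_mixture(m, n, r, separation, seed=0):
    """Draw m points in n dimensions from r Gaussian clusters.

    Assumption (not from the source): each center coordinate is Gaussian
    with standard deviation separation / sqrt(n), and the noise around
    each center has unit variance per coordinate.
    """
    rng = random.Random(seed)
    scale = separation / math.sqrt(n)
    centers = [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(r)]
    # Hidden cluster membership that a clustering algorithm would try to recover.
    labels = [rng.randrange(r) for _ in range(m)]
    points = [[c + rng.gauss(0.0, 1.0) for c in centers[k]] for k in labels]
    return points, labels


if __name__ == "__main__":
    m, n, r = 200, 100, 8
    alpha = aspect_ratio(m, n)
    points, labels = sample_mixture(m, n, r, separation=2.0)
    print("alpha =", alpha)
    print("condition r > 4 + 2*sqrt(alpha) holds:", has_gap(r, alpha))
```

The sampler only illustrates the data model; locating the thresholds themselves requires the analysis described in the paper.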

research
02/26/2020

The role of regularization in classification of high-dimensional noisy Gaussian mixture

We consider a high-dimensional mixture of two Gaussians in the noisy reg...
research
12/01/2019

On the optimality of kernels for high-dimensional clustering

This paper studies the optimality of kernel methods in high-dimensional ...
research
02/13/2021

ThetA – fast and robust clustering via a distance parameter

Clustering is a fundamental problem in machine learning where distance-b...
research
08/10/2021

Correlation Clustering Reconstruction in Semi-Adversarial Models

Correlation Clustering is an important clustering problem with many appl...
research
09/05/2023

Superclustering by finding statistically significant separable groups of optimal gaussian clusters

The paper presents the algorithm for clustering a dataset by grouping th...
research
10/15/2020

Cascade of Phase Transitions for Multi-Scale Clustering

We present a novel framework exploiting the cascade of phase transitions...
research
05/26/2022

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

A simple model to study subspace clustering is the high-dimensional k-Ga...
