Bridging Distribution Learning and Image Clustering in High-dimensional Space

08/29/2023
by Guanfang Dong, et al.

Distribution learning focuses on learning the probability density function from a set of data samples. In contrast, clustering aims to group similar objects together in an unsupervised manner. Usually, these two tasks are considered unrelated. However, the two may be indirectly related, with Gaussian Mixture Models (GMM) acting as a bridge. In this paper, we explore the relationship between distribution learning and clustering, with the motivation to fill the gap between these two fields. We first use an autoencoder (AE) to encode images into a high-dimensional latent space. Then, Monte-Carlo Marginalization (MCMarg) and Kullback-Leibler (KL) divergence loss are used to fit the Gaussian components of the GMM and learn the data distribution. Finally, image clustering is achieved through the Gaussian components of the GMM. Yet, the "curse of dimensionality" poses severe challenges for most clustering algorithms. Experimental results show that, compared with the classic Expectation-Maximization (EM) algorithm, MCMarg and KL divergence can greatly alleviate this difficulty. Based on these results, we believe distribution learning can exploit the potential of GMM in image clustering within high-dimensional space.
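
To make the pipeline in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that marginalization means projecting both the latent codes and the GMM onto a randomly drawn unit direction, that the KL divergence is estimated from a histogram of the projected data, and that the GMM has diagonal covariances. All function names and parameters are hypothetical and chosen only for illustration. The sketch relies on one standard fact: projecting a Gaussian with mean mu and diagonal variances s onto a unit direction v gives a 1-D Gaussian with mean v·mu and variance sum(v_i^2 * s_i).

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def marginal_gmm(weights, means, variances, direction):
    """Project a diagonal-covariance GMM onto one direction.

    Returns a list of (weight, mean, variance) triples for the 1-D mixture.
    """
    comps = []
    for w, mu, var in zip(weights, means, variances):
        m = sum(d * x for d, x in zip(direction, mu))
        s2 = sum(d * d * x for d, x in zip(direction, var))
        comps.append((w, m, s2))
    return comps


def mixture_pdf_1d(x, comps):
    """Density of a 1-D Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / s2) / math.sqrt(2.0 * math.pi * s2)
        for w, m, s2 in comps
    )


def kl_on_direction(data, weights, means, variances, direction, bins=20):
    """Histogram estimate of KL(projected data || projected GMM)."""
    proj = [sum(d * x for d, x in zip(direction, p)) for p in data]
    lo, hi = min(proj), max(proj)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all points coincide
    counts = [0] * bins
    for t in proj:
        counts[min(int((t - lo) / width), bins - 1)] += 1
    comps = marginal_gmm(weights, means, variances, direction)
    kl = 0.0
    for i, c in enumerate(counts):
        if c == 0:
            continue  # empty bins contribute nothing to the sum
        p = c / len(proj)  # empirical probability mass of the bin
        # Model mass of the bin, approximated with the midpoint rule.
        q = mixture_pdf_1d(lo + (i + 0.5) * width, comps) * width
        kl += p * math.log(p / max(q, 1e-12))
    return kl


def assign_cluster(point, weights, means, variances):
    """Index of the component with the highest posterior responsibility."""
    scores = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, s2 in zip(point, mu, var):
            log_p += -0.5 * math.log(2.0 * math.pi * s2) - 0.5 * (x - m) ** 2 / s2
        scores.append(log_p)
    return scores.index(max(scores))
```

In a training loop, one would presumably average `kl_on_direction` over many directions from `random_unit_vector` and update the GMM parameters to reduce it. The abstract does not specify the optimizer or the density estimator, so those choices are left open here. Once the GMM is fitted, `assign_cluster` maps each latent code to a cluster.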

research
11/15/2017

Sliced Wasserstein Distance for Learning Gaussian Mixture Models

Gaussian mixture models (GMM) are powerful parametric tools with many ap...
research
08/19/2019

Quantum Expectation-Maximization Algorithm

Clustering algorithms are a cornerstone of machine learning applications...
research
11/30/2022

High-Dimensional Wide Gap k-Means Versus Clustering Axioms

Kleinberg's axioms for distance based clustering proved to be contradict...
research
12/11/2022

Stochastic First-Order Learning for Large-Scale Flexibly Tied Gaussian Mixture Model

Gaussian Mixture Models (GMM) are one of the most potent parametric dens...
research
08/02/2022

Cluster Weighted Model Based on TSNE algorithm for High-Dimensional Data

Similar to many Machine Learning models, both accuracy and speed of the ...
research
03/28/2022

Learning Sparse Mixture Models

This work approximates high-dimensional density functions with an ANOVA-...
research
12/16/2021

High-dimensional logistic entropy clustering

Minimization of the (regularized) entropy of classification probabilitie...