A generalized Bayes framework for probabilistic clustering

06/09/2020
by Tommaso Rigon, et al.

Loss-based clustering methods, such as k-means and its variants, are standard tools for finding groups in data. However, they do not quantify the uncertainty in the estimated clusters. Model-based clustering based on mixture models provides an alternative, but such methods face computational problems and are highly sensitive to the choice of kernel. This article proposes a generalized Bayes framework that bridges these two paradigms through the use of Gibbs posteriors. In conducting Bayesian updating, the log likelihood is replaced by a loss function for clustering, leading to a rich family of clustering methods. The Gibbs posterior represents a coherent updating of Bayesian beliefs without needing to specify a likelihood for the data, and can be used to characterize uncertainty in clustering. We consider losses based on Bregman divergence and pairwise similarities, and develop efficient deterministic algorithms for point estimation along with sampling algorithms for uncertainty quantification. Several existing clustering algorithms, including k-means, can be interpreted as generalized Bayes estimators under our framework, and hence we provide a method of uncertainty quantification for these approaches.
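To make the idea of replacing the log likelihood with a clustering loss concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes the k-means loss (within-cluster sum of squares), a uniform prior over label vectors with empty clusters allowed, and a fixed tuning parameter `lam`. Under those assumptions the Gibbs posterior over labels is proportional to exp(-lam * loss), and a plain Gibbs sampler can draw from it. The function names `kmeans_loss`, `gibbs_posterior_sweep`, and `co_clustering_probability` are hypothetical, chosen only for this illustration.

```python
import math
import random


def kmeans_loss(points, labels, k):
    """Within-cluster sum of squared distances to each cluster mean."""
    dim = len(points[0])
    total = 0.0
    for cluster in range(k):
        members = [p for p, c in zip(points, labels) if c == cluster]
        if not members:
            continue  # an empty cluster contributes no loss
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return total


def gibbs_posterior_sweep(points, labels, k, lam, rng):
    """One Gibbs sweep over all labels (modifies `labels` in place).

    Assumption: uniform prior over label vectors, so each full conditional
    is proportional to exp(-lam * loss).
    """
    for i in range(len(points)):
        losses = []
        for cluster in range(k):
            labels[i] = cluster
            losses.append(kmeans_loss(points, labels, k))
        best = min(losses)  # subtract the minimum for numerical stability
        weights = [math.exp(-lam * (loss - best)) for loss in losses]
        labels[i] = rng.choices(range(k), weights=weights)[0]
    return labels


def co_clustering_probability(points, k, lam, n_sweeps, burn_in, i, j, seed=0):
    """Estimate the Gibbs-posterior probability that points i and j share a cluster."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]
    together = 0
    for sweep in range(n_sweeps):
        gibbs_posterior_sweep(points, labels, k, lam, rng)
        if sweep >= burn_in and labels[i] == labels[j]:
            together += 1
    return together / (n_sweeps - burn_in)


# Illustrative data (made up for this sketch): two well-separated groups on a line.
points = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
p_same = co_clustering_probability(points, k=2, lam=5.0, n_sweeps=100, burn_in=20, i=0, j=1)
p_diff = co_clustering_probability(points, k=2, lam=5.0, n_sweeps=100, burn_in=20, i=0, j=3)
```

The co-clustering frequencies are one simple summary of clustering uncertainty. Smaller values of `lam` flatten the Gibbs posterior and spread mass over more partitions, while very large values concentrate it near the loss-minimizing (k-means) partition. Recomputing the full loss for every candidate label is wasteful and is done here only to keep the sketch short.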

research
02/13/2023

Reliable Bayesian Inference in Misspecified Models

We provide a general solution to a fundamental open problem in Bayesian ...
research
10/13/2022

Probabilistic Approach to Parameteric Inverse Problems Using Gibbs Posteriors

We propose a general framework for obtaining probabilistic solutions to ...
research
12/10/2012

MAD-Bayes: MAP-based Asymptotic Derivations from Bayes

The classical mixture of Gaussians model is related to K-means via small...
research
06/02/2018

Optimal Clustering under Uncertainty

Classical clustering algorithms typically either lack an underlying prob...
research
02/05/2018

Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent

Coherent uncertainty quantification is a key strength of Bayesian method...
research
11/18/2017

The Bayes Lepski's Method and Credible Bands through Volume of Tubular Neighborhoods

For a general class of priors based on random series basis expansion, we...
research
08/25/2015

Clustering With Side Information: From a Probabilistic Model to a Deterministic Algorithm

In this paper, we propose a model-based clustering method (TVClust) that...
