Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

09/26/2013
by Amar Shah, et al.

Semi-supervised clustering is the task of grouping data points into clusters when only a fraction of the points are labelled. The true number of clusters in the data is often unknown, yet most models require this parameter as an input. Dirichlet process mixture models are appealing because they can infer the number of clusters from the data. However, these models do not deal well with high-dimensional data and can encounter difficulties in inference. We present a novel nonparametric Bayesian kernel-based method to cluster data points without the need to prespecify the number of clusters or to model the complicated densities from which data points are assumed to be generated. The key insight is to use determinants of submatrices of a kernel matrix as a measure of how close together the points in a set are. We explore some theoretical properties of the model and derive a natural Gibbs-based algorithm with MCMC hyperparameter learning. The model is applied to a variety of synthetic and real-world data sets.
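The determinant idea in the abstract can be illustrated with a small numerical sketch. This is not the authors' model or algorithm; it only shows, under assumptions, how the determinant of a kernel submatrix behaves for tightly grouped versus widely separated points. The squared-exponential (RBF) kernel, the lengthscale of 1.0, the toy data, and the helper names `rbf_kernel` and `subset_determinant` are all choices made here for illustration.

```python
import numpy as np


def rbf_kernel(X, lengthscale=1.0):
    """Squared-exponential kernel matrix (an assumed kernel choice)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def subset_determinant(K, idx):
    """Determinant of the submatrix of K indexed by the points in idx."""
    idx = np.asarray(idx)
    return np.linalg.det(K[np.ix_(idx, idx)])


# Toy data: three points packed closely together, three points far apart.
rng = np.random.default_rng(0)
tight = rng.normal(loc=0.0, scale=0.05, size=(3, 2))
spread = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([tight, spread])

K = rbf_kernel(X)

# Nearby points give nearly identical kernel rows, so the submatrix is
# close to singular and its determinant is close to zero.
det_tight = subset_determinant(K, [0, 1, 2])

# Distant points give a submatrix close to the identity, so the
# determinant is close to one.
det_spread = subset_determinant(K, [3, 4, 5])

print(f"tight subset determinant:  {det_tight:.6f}")
print(f"spread subset determinant: {det_spread:.6f}")
```

In this sketch a small determinant indicates that the selected points lie close together under the kernel, which is the sense in which a determinant can serve as a closeness measure. How the paper turns such determinants into a prior over partitions is not specified in the abstract and is not reproduced here.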


research
09/10/2019

Subspace clustering without knowing the number of clusters: A parameter free approach

Subspace clustering, the task of clustering high dimensional data when t...
research
03/16/2020

A semi-supervised sparse K-Means algorithm

We consider the problem of data clustering with unidentified feature qua...
research
10/21/2015

Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

We propose a novel method for multiple clustering that assumes a co-clus...
research
05/04/2017

Semi-supervised model-based clustering with controlled clusters leakage

In this paper, we focus on finding clusters in partially categorized dat...
research
10/19/2018

Bayesian Distance Clustering

Model-based clustering is widely-used in a variety of application areas....
research
06/12/2023

A Computational Theory and Semi-Supervised Algorithm for Clustering

A computational theory for clustering and a semi-supervised clustering a...
research
07/29/2022

Bayesian nonparametric mixture inconsistency for the number of components: How worried should we be in practice?

We consider the Bayesian mixture of finite mixtures (MFMs) and Dirichlet...
