Bayesian Hierarchical Clustering with Exponential Family: Small-Variance Asymptotics and Reducibility

01/29/2015
by Juho Lee, et al.

Bayesian hierarchical clustering (BHC) is an agglomerative clustering method in which a probabilistic model is defined and its marginal likelihoods are evaluated to decide which clusters to merge. While BHC provides a few advantages over traditional distance-based agglomerative clustering algorithms, the successive evaluation of marginal likelihoods and the careful hyperparameter tuning it requires are cumbersome and limit its scalability. In this paper we relax BHC into a non-probabilistic formulation by exploring small-variance asymptotics in conjugate-exponential models. From the asymptotic limit of the BHC model we develop a novel clustering algorithm, referred to as relaxed BHC (RBHC), that exhibits the scalability of distance-based agglomerative clustering algorithms as well as the flexibility of Bayesian nonparametric models. We also investigate the reducibility of the dissimilarity measure that emerges from the asymptotic limit of the BHC model, which allows us to use scalable algorithms such as the nearest neighbor chain algorithm. Numerical experiments on both synthetic and real-world datasets demonstrate the validity and high performance of our method.
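
To make the role of reducibility concrete, here is a minimal sketch of the nearest neighbor chain algorithm. The abstract does not give the RBHC dissimilarity, so this sketch substitutes Ward's dissimilarity, a standard measure known to be reducible; the function names `ward` and `nn_chain` are illustrative and not taken from the paper. A dissimilarity d is reducible when merging clusters i and j never brings the result closer to a third cluster k than the nearer of the two was, that is, d(i ∪ j, k) ≥ min(d(i, k), d(j, k)) whenever i and j are mutual nearest neighbors. Under that property, the part of the chain left over after a merge remains valid and need not be rebuilt.

```python
import numpy as np

def ward(size_a, mean_a, size_b, mean_b):
    # Ward's dissimilarity between two clusters (an assumption standing in
    # for the paper's RBHC dissimilarity, which the abstract does not state).
    diff = mean_a - mean_b
    return size_a * size_b / (size_a + size_b) * float(diff @ diff)

def nn_chain(points):
    """Agglomerate points with the nearest neighbor chain algorithm.

    Returns a list of (cluster_a, cluster_b, dissimilarity) merges.
    Original points have ids 0..n-1; each merge creates the next id.
    """
    X = np.asarray(points, dtype=float)
    n = len(X)
    size = {i: 1 for i in range(n)}
    mean = {i: X[i] for i in range(n)}
    merges, chain, next_id = [], [], n

    while len(size) > 1:
        if not chain:
            chain.append(min(size))  # arbitrary starting cluster
        while True:
            a = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            best, best_d = None, np.inf
            for b in size:
                if b == a:
                    continue
                d = ward(size[a], mean[a], size[b], mean[b])
                # On ties, prefer the previous chain element so the chain
                # cannot grow forever.
                if d < best_d or (d == best_d and b == prev):
                    best, best_d = b, d
            if best == prev:
                break  # a and prev are reciprocal nearest neighbors
            chain.append(best)

        b = chain.pop()
        a = chain.pop()
        total = size[a] + size[b]
        new_mean = (size[a] * mean[a] + size[b] * mean[b]) / total
        for c in (a, b):
            del size[c], mean[c]
        size[next_id], mean[next_id] = total, new_mean
        merges.append((a, b, best_d))
        next_id += 1
        # Reducibility guarantees the remaining chain is still a valid
        # nearest neighbor chain, so it is kept as-is.
    return merges

merges = nn_chain([[0.0], [1.0], [10.0], [11.0]])
```

Because merges happen whenever a reciprocal nearest-neighbor pair is found, the merge order can differ from that of a greedy closest-pair scan, but for a reducible dissimilarity the resulting hierarchy is the same up to tie-breaking.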


research
06/05/2018

Hierarchical Graph Clustering using Node Pair Sampling

We present a novel hierarchical graph clustering algorithm inspired by m...
research
07/26/2017

Dynamic Clustering Algorithms via Small-Variance Analysis of Markov Chain Mixture Models

Bayesian nonparametrics are a class of probabilistic models in which the...
research
11/02/2011

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Bayesian models offer great flexibility for clustering applications---Ba...
research
04/20/2015

Nonparametric Nearest Neighbor Random Process Clustering

We consider the problem of clustering noisy finite-length observations o...
research
08/25/2015

Clustering With Side Information: From a Probabilistic Model to a Deterministic Algorithm

In this paper, we propose a model-based clustering method (TVClust) that...
research
10/22/2020

Scalable Bottom-Up Hierarchical Clustering

Bottom-up algorithms such as the classic hierarchical agglomerative clus...
research
02/28/2019

Efficient Parameter-free Clustering Using First Neighbor Relations

We present a new clustering method in the form of a single clustering eq...
