Bayesian Hierarchical Mixture Clustering using Multilevel Hierarchical Dirichlet Processes

05/13/2019
by Weipeng Huang, et al.

This paper addresses the problem of hierarchical non-overlapping clustering of a dataset. In such a clustering, each data item is associated with exactly one leaf node, and each internal node is associated with all the data items stored in the sub-tree beneath it, so that each level of the hierarchy corresponds to a partition of the dataset. We develop a novel Bayesian nonparametric method that combines the nested Chinese Restaurant Process (nCRP) and the Hierarchical Dirichlet Process (HDP). Compared with other existing Bayesian approaches, our solution handles data with complex latent mixture features, a setting that has not previously been explored in the literature. We discuss the details of the model and the inference procedure. Experiments on three datasets show that our method achieves solid empirical results in comparison with existing algorithms.
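
The tree structure described in the abstract can be illustrated with a minimal sketch of the standard nCRP prior alone. This is not the authors' model: it omits the HDP component and all inference. The function names `sample_ncrp_paths` and `partition_at_level`, the fixed `depth`, and the concentration parameter `alpha` are assumptions introduced here for illustration. Each item walks from the root to a leaf, choosing an existing child with probability proportional to the number of items already routed through it, or a new child with probability proportional to `alpha`.

```python
import random
from collections import defaultdict


def sample_ncrp_paths(n_items, depth, alpha, seed=0):
    """Draw one root-to-leaf path per item from a nested CRP prior.

    Nodes are identified by the tuple of child indices leading to them;
    the root is the empty tuple. `alpha` must be positive.
    """
    rng = random.Random(seed)
    # child_counts[node][child] = number of items routed through that child
    child_counts = defaultdict(dict)
    paths = []
    for _ in range(n_items):
        node = ()
        for _ in range(depth):
            counts = child_counts[node]
            total = sum(counts.values()) + alpha
            u = rng.random() * total
            chosen = len(counts)  # falls through to a new child by default
            acc = 0.0
            for child, count in counts.items():
                acc += count
                if u < acc:
                    chosen = child  # existing child, chosen with prob count / total
                    break
            counts[chosen] = counts.get(chosen, 0) + 1
            node = node + (chosen,)
        paths.append(node)
    return paths


def partition_at_level(paths, level):
    """Group item indices by the node they pass through at the given level."""
    blocks = defaultdict(list)
    for item, path in enumerate(paths):
        blocks[path[:level]].append(item)
    return dict(blocks)


if __name__ == "__main__":
    paths = sample_ncrp_paths(n_items=20, depth=3, alpha=1.0)
    for level in range(4):
        print(level, partition_at_level(paths, level))
```

Because every item has exactly one path, each item sits in exactly one leaf, and grouping items by path prefix yields one partition per level, with each deeper partition refining the one above it.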

research
09/16/2015

Dirichlet Fragmentation Processes

Tree structures are ubiquitous in data across many domains, and many dat...
research
01/09/2014

Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts

We present a Bayesian nonparametric framework for multilevel clustering ...
research
05/20/2020

The semi-hierarchical Dirichlet Process and its application to clustering homogeneous distributions

Assessing homogeneity of distributions is an old problem that has receiv...
research
05/14/2021

Posterior Regularisation on Bayesian Hierarchical Mixture Clustering

We study a recent inferential framework, named posterior regularisation,...
research
06/17/2019

Nested partitions from hierarchical clustering statistical validation

We develop a greedy algorithm that is fast and scalable in the detection...
research
07/31/2020

Bayesian Approaches for Flexible and Informative Clustering of Microbiome Data

We propose two unsupervised clustering methods that are designed for hum...
research
12/03/2015

CrossCat: A Fully Bayesian Nonparametric Method for Analyzing Heterogeneous, High Dimensional Data

There is a widespread need for statistical methods that can analyze high...