Spying on the prior of the number of data clusters and the partition distribution in Bayesian cluster analysis

by Jan Greve, et al.

Mixture models are the key modelling approach for Bayesian cluster analysis. Different likelihood and prior specifications are required to capture the prototypical shape of the clusters. The mixture modelling approaches also differ crucially in how they specify the prior on the number of components and the prior on the component weight distribution. We investigate how these specifications affect the implicitly induced prior on the number of 'filled' components, i.e., data clusters, and the prior on the partitions. We derive computationally feasible calculations to obtain these implicit priors for reasonable data analysis settings and make a reference implementation available in the R package 'fipp'. In many applications the implicit priors are of more practical relevance than the explicit priors imposed, and thus suitable prior specifications depend on the implicit priors induced. We highlight the insights that can be gained from inspecting these implicit priors by analysing them for three modelling approaches previously proposed for Bayesian cluster analysis: the Dirichlet process mixture and the static and dynamic mixture of finite mixtures models. The default priors suggested in the literature for these modelling approaches are used and the induced priors compared. Based on the implicit priors, we discuss the suitability of these modelling approaches and prior specifications when aiming at sparse cluster solutions and flexibility in the prior on the partitions.
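To make the idea of an induced prior on the number of data clusters concrete, here is a minimal Python sketch for the simplest of the three cases, a Dirichlet process mixture with a fixed concentration parameter. It is not the 'fipp' implementation and does not cover the mixture of finite mixtures models; the function name `dpm_cluster_prior` and its arguments are hypothetical and chosen for illustration only. The sketch relies on the standard Chinese restaurant process recursion, under which the (i+1)-th observation opens a new cluster with weight alpha and joins an existing one with weight i.

```python
from fractions import Fraction


def dpm_cluster_prior(n, alpha):
    """Prior P(K+ = k), k = 0..n, on the number of data clusters among n
    observations under a Dirichlet process with fixed concentration alpha.

    Hypothetical helper for illustration; not part of the 'fipp' package.
    """
    alpha = Fraction(alpha)
    # w[k] holds the unnormalised weight of ending up with k clusters.
    # With zero observations there are zero clusters.
    w = [Fraction(1)]
    for i in range(n):
        new = [Fraction(0)] * (len(w) + 1)
        for k, wk in enumerate(w):
            new[k] += wk * i          # join one of the existing clusters
            new[k + 1] += wk * alpha  # open a new cluster
        w = new
    total = sum(w)  # equals alpha * (alpha + 1) * ... * (alpha + n - 1)
    return [wk / total for wk in w]


if __name__ == "__main__":
    prior = dpm_cluster_prior(n=100, alpha=1)
    mean_clusters = sum(k * p for k, p in enumerate(prior))
    print(f"prior mean number of data clusters: {float(mean_clusters):.3f}")
    for k in range(1, 11):
        print(k, f"{float(prior[k]):.4f}")
```

Exact rational arithmetic keeps the sketch free of overflow concerns for moderate n; for large samples a log-scale recursion would be the more practical choice.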




How many data clusters are in the Galaxy data set? Bayesian cluster analysis in action

In model-based clustering, the Galaxy data set is often used as a benchm...

Dynamic mixtures of finite mixtures and telescoping sampling

Within a Bayesian framework, a comprehensive investigation of the model ...

Repulsion, Chaos and Equilibrium in Mixture Models

Mixture models are commonly used in applications with heterogeneity and ...

On fixed and uncertain mixture prior weights

This paper focuses on the specification of the weights for the component...

Contaminated Gibbs-type priors

Gibbs-type priors are widely used as key components in several Bayesian ...

Flexible Bayesian Multiple Comparison Adjustment Using Dirichlet Process and Beta-Binomial Model Priors

Researchers frequently wish to assess the equality or inequality of grou...

Bayesian semiparametric modelling of phase-varying point processes

We propose a Bayesian semiparametric approach for modelling registration...
