Powered Dirichlet Process for Controlling the Importance of "Rich-Get-Richer" Prior Assumptions in Bayesian Clustering

04/26/2021
by Gaël Poux-Médard, et al.

One of the most widely used priors in Bayesian clustering is the Dirichlet prior, which can be expressed as a Chinese Restaurant Process. This process allows nonparametric estimation of the number of clusters when partitioning datasets. Its key feature is the "rich-get-richer" property: the a priori probability that a cluster is chosen depends linearly on its population. In this paper, we show that such a prior is not always the best choice for modeling data. To address this problem, we derive the Powered Chinese Restaurant Process from a modified version of the Dirichlet-Multinomial distribution. We then derive some of its fundamental properties (expected number of clusters, convergence). Unlike state-of-the-art efforts in this direction, this new formulation allows direct control of the importance of the "rich-get-richer" prior.
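
To make the "rich-get-richer" idea concrete, here is a minimal Python sketch of how such a prior could be sampled. It is an illustration, not the authors' implementation. The abstract does not state the exact formula, so the sketch assumes that an existing cluster with population n receives prior weight n ** r and that a new cluster receives weight alpha. The names `powered_crp_probs`, `sample_partition`, `alpha`, and `r` are hypothetical. Under this assumption, r = 1 gives the usual linear dependence on population, and r = 0 makes all existing clusters equally likely regardless of size.

```python
import random


def powered_crp_probs(counts, alpha, r):
    """Prior probabilities of joining each existing cluster or opening a new one.

    Assumed form (not taken from the abstract): an existing cluster with
    population n gets weight n ** r; a new cluster gets weight alpha.
    The last entry of the returned list is the new-cluster probability.
    """
    weights = [n ** r for n in counts] + [alpha]
    total = sum(weights)
    return [w / total for w in weights]


def sample_partition(n_points, alpha, r, seed=0):
    """Seat n_points one at a time and return the final cluster populations."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_points):
        probs = powered_crp_probs(counts, alpha, r)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
    return counts
```

Calling `sample_partition(1000, alpha=1.0, r=1.0)` and then repeating with a smaller or larger `r` is one way to see how the exponent changes the balance between large and small clusters under this assumed form.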

02/15/2018

Reducing over-clustering via the powered Chinese restaurant process

Dirichlet process mixture (DPM) models tend to produce many small cluste...
09/15/2021

Powered Hawkes-Dirichlet Process: Challenging Textual Clustering using a Flexible Temporal Prior

The textual content of a document and its publication date are intertwin...
10/15/2018

Evaluating Sensitivity to the Stick Breaking Prior in Bayesian Nonparametrics

A central question in many probabilistic clustering problems is how many...
11/04/2014

Simple approximate MAP Inference for Dirichlet processes

The Dirichlet process mixture (DPM) is a ubiquitous, flexible Bayesian n...
01/18/2022

Flexible clustering via hidden hierarchical Dirichlet priors

The Bayesian approach to inference stands out for naturally allowing bor...
01/01/2018

An elementary derivation of the Chinese restaurant process from Sethuraman's stick-breaking process

The Chinese restaurant process and the stick-breaking process are the tw...
11/21/2012

Bayesian nonparametric Plackett-Luce models for the analysis of preferences for college degree programmes

In this paper we propose a Bayesian nonparametric model for clustering p...