Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data

09/19/2017
by Ruohui Wang, et al.

We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are spread across multiple computing nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that they allow new components to be introduced on the fly as needed. This, however, poses an important challenge for distributed estimation: how to handle new components efficiently and consistently. To tackle this problem, we propose a new estimation method that allows new components to be created locally on individual computing nodes. Components corresponding to the same cluster are then identified and merged via a probabilistic consolidation scheme. In this way, we can maintain the consistency of estimation with very low communication cost. Experiments on large real-world data sets show that the proposed method achieves high scalability in distributed and asynchronous environments without compromising mixing performance.
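As a rough illustration of what "introduced on the fly" means, the sketch below draws cluster labels from a Chinese restaurant process, the standard sequential view of the Dirichlet process prior. It is not the distributed estimation method or the consolidation scheme proposed in the paper, and it ignores the data likelihood entirely. The function name `crp_assignments` and the parameters `n_points`, `alpha` (the concentration parameter, assumed positive) and `seed` are hypothetical names chosen for this example.

```python
import random
from typing import List, Optional


def crp_assignments(n_points: int, alpha: float,
                    seed: Optional[int] = None) -> List[int]:
    """Draw cluster labels for n_points from a Chinese restaurant process.

    alpha is the concentration parameter (assumed > 0); larger values
    make new components more likely.
    """
    rng = random.Random(seed)
    counts: List[int] = []  # counts[k] = number of points in component k
    labels: List[int] = []
    for _ in range(n_points):
        # An existing component k is chosen with weight counts[k];
        # a brand-new component is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new component on the fly
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_points=1000, alpha=2.0, seed=0)
    print("components used:", max(labels) + 1)
```

In a distributed setting, each node running a process like this on its own shard could open components independently, which is why the abstract calls for a step that identifies and merges components referring to the same cluster.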

research
01/14/2015

Dirichlet Process Parsimonious Mixtures for clustering

The parsimonious Gaussian mixture models, which exploit an eigenvalue de...
research
11/29/2012

Exact and Efficient Parallel Inference for Nonparametric Mixture Models

Nonparametric mixture models based on the Dirichlet process are an elega...
research
06/27/2012

Gibbs Sampling for (Coupled) Infinite Mixture Models in the Stick Breaking Representation

Nonparametric Bayesian approaches to clustering, information retrieval, ...
research
06/19/2019

Importance conditional sampling for Bayesian nonparametric mixtures

Nonparametric mixture models based on the Pitman-Yor process represent a...
research
05/19/2020

Mixture Models and Networks – Overview of Stochastic Blockmodelling

Mixture models are probabilistic models aimed at uncovering and represen...
research
07/27/2018

Infinite Mixture of Inverted Dirichlet Distributions

In this work, we develop a novel Bayesian estimation method for the Diri...
research
05/24/2016

Consistency Analysis for the Doubly Stochastic Dirichlet Process

This technical report proves components consistency for the Doubly Stoch...