Approximate Inference via Clustering

11/28/2021
by Qianqian Song, et al.

In recent years, large-scale Bayesian learning has drawn a great deal of attention. However, in the big-data era, the amount of data we face is growing much faster than our ability to deal with it. Fortunately, large-scale datasets usually have rich internal structure and are somewhat redundant. In this paper, we attempt to simplify the Bayesian posterior by exploiting this structure. Specifically, we restrict our interest to so-called well-clustered datasets and construct an approximate posterior from the clustering information. The clustering structure can be obtained efficiently via a particular clustering algorithm. When constructing the approximate posterior, all data points in the same cluster are replaced by the centroid of that cluster. As a result, the posterior is significantly simplified. Theoretically, we show that, under certain conditions, the approximate posterior we construct is close to the exact posterior as measured by KL divergence. Furthermore, thorough experiments validate that the constructed posterior is a good approximation to the true posterior and is much easier to sample from.
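The centroid-replacement idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the model (a Gaussian with unknown mean and unit variance), the prior scale, the synthetic data, and the use of plain Lloyd's k-means in place of the unspecified clustering algorithm. The function names are hypothetical. The approximate log posterior weights each centroid's log-likelihood by its cluster size, which is one direct reading of "replacing every point by its centroid".

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; a stand-in for the paper's clustering step (assumption)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

def log_lik(x, theta):
    # Unit-variance Gaussian log-likelihood, up to an additive constant (assumed model).
    return -0.5 * np.sum((x - theta) ** 2, axis=-1)

def log_prior(theta, scale=10.0):
    # Zero-mean isotropic Gaussian prior, up to an additive constant (assumed prior).
    return -0.5 * np.sum(theta ** 2) / scale ** 2

def exact_log_post(theta, X):
    # Unnormalized exact log posterior: one likelihood term per data point.
    return log_prior(theta) + log_lik(X, theta).sum()

def approx_log_post(theta, centroids, counts):
    # Unnormalized approximate log posterior: one term per cluster,
    # with each centroid weighted by the number of points it replaces.
    return log_prior(theta) + np.sum(counts * log_lik(centroids, theta))

# Synthetic well-clustered data: three tight blobs of 200 points each.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
X = np.concatenate([c + 0.1 * rng.standard_normal((200, 2)) for c in centers])

centroids, labels = kmeans(X, k=3)
counts = np.bincount(labels, minlength=3)

theta = np.array([0.5, 1.0])
print("exact :", exact_log_post(theta, X))
print("approx:", approx_log_post(theta, centroids, counts))
```

Evaluating the approximate posterior costs one likelihood term per cluster rather than one per data point, which is where the simplification comes from. For this particular Gaussian-mean model the two unnormalized log posteriors differ only by a constant that does not depend on theta (the within-cluster scatter), so the normalized posteriors coincide; that is a property of this toy model and should not be expected in general.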


