A Distributed Algorithm for Polya-Gamma Data Augmentation

The Polya-Gamma data augmentation (PG-DA) algorithm is routinely used for Bayesian inference in logistic models. It has broad applications and outperforms other sampling algorithms in numerical stability and ease of implementation. The Markov chain produced by the PG-DA algorithm is also known to be uniformly ergodic. However, the algorithm is prohibitively slow in massive data settings because every iteration requires a pass through the entire data set. We develop a simple distributed extension of the PG-DA strategy based on the divide-and-conquer technique. The extension divides the data into a sufficiently large number of subsets, performs PG-type data augmentation on the subsets in parallel using a powered likelihood, and produces Monte Carlo draws of the parameter by combining the Markov chain Monte Carlo (MCMC) draws of the parameter obtained from each subset. In posterior inference, the combined parameter draws play the role of MCMC draws from the PG-DA algorithm. Our main contributions are threefold. First, we develop a modified PG-DA algorithm with a powered likelihood for logistic models, which is run on the subsets to obtain subset MCMC draws. Second, we modify the existing class of combination algorithms by introducing a scaling step. Finally, we demonstrate through diverse simulated and real data analyses that our distributed algorithm outperforms its competitors in statistical accuracy and computational efficiency. We also provide theoretical support for our empirical observations.
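The divide, augment, and combine steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several choices are assumptions made for illustration: the model has a single coefficient and no intercept, the prior is Gaussian with variance `prior_var`, the Polya-Gamma draw uses a truncated sum-of-gammas approximation rather than an exact sampler, and the full conditionals are what the standard Polya-Gamma identity gives when the logistic likelihood is raised to the power `power` (taken here to be the number of subsets). The `combine` function is a plain one-dimensional quantile average that stands in for the combination step; it does not include the scaling step the abstract mentions, whose details are not given here. All function names are hypothetical.

```python
import math
import random


def sample_pg(b, c, rng, n_terms=50):
    """Approximate draw from PG(b, c) using a truncated sum of gammas.

    Assumption: truncating the infinite series at n_terms is accurate
    enough for illustration; an exact sampler would be used in practice.
    """
    total = 0.0
    for k in range(1, n_terms + 1):
        g = rng.gammavariate(b, 1.0)  # Gamma(shape=b, scale=1)
        total += g / ((k - 0.5) ** 2 + c * c / (4.0 * math.pi ** 2))
    return total / (2.0 * math.pi ** 2)


def subset_pg_da(x, y, power, n_iter, rng, prior_var=100.0):
    """PG-type data augmentation on one subset with a powered likelihood.

    Assumption: scalar coefficient beta with a N(0, prior_var) prior.
    Raising the logistic likelihood to `power` gives
    omega_i | beta ~ PG(power, x_i * beta) and a Gaussian draw for beta.
    """
    beta = 0.0
    draws = []
    # kappa_i = power * (y_i - 1/2); its weighted sum does not change.
    kappa_sum = sum(power * (yi - 0.5) * xi for xi, yi in zip(x, y))
    for _ in range(n_iter):
        # Augmentation step: one latent variable per observation.
        omega = [sample_pg(power, xi * beta, rng) for xi in x]
        # Parameter step: Gaussian full conditional for beta.
        precision = sum(w * xi * xi for w, xi in zip(omega, x)) + 1.0 / prior_var
        mean = kappa_sum / precision
        beta = rng.gauss(mean, 1.0 / math.sqrt(precision))
        draws.append(beta)
    return draws


def combine(subset_draws):
    """Average the j-th smallest draw across subsets (quantile averaging).

    Assumption: a simple stand-in for the combination step, without the
    scaling step described in the abstract.
    """
    sorted_draws = [sorted(d) for d in subset_draws]
    n = len(sorted_draws[0])
    return [sum(d[j] for d in sorted_draws) / len(sorted_draws) for j in range(n)]
```

In use, the data would be split into `k` subsets, `subset_pg_da` would be run on each subset with `power=k` (in parallel in a real deployment, sequentially in this sketch), and the post-burn-in draws would be passed to `combine`. The subset chains are independent of one another, which is what allows the parallel execution.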


