Sketching in Bayesian High Dimensional Regression With Big Data Using Gaussian Scale Mixture Priors

05/11/2021
by Rajarshi Guhaniyogi, et al.
Bayesian computation for high dimensional linear regression models with a popular Gaussian scale mixture prior distribution, using Markov Chain Monte Carlo (MCMC) or its variants, can be extremely slow or completely prohibitive because the computational cost grows in the cubic order of p, where p is the number of features. Although a few recently developed algorithms make the computation efficient in the presence of a small to moderately large sample size (with complexity growing in the cubic order of n), the computation becomes intractable when the sample size n is also large. In this article we adopt the data sketching approach: we compress the n original samples by a random linear transformation to m << n samples in p dimensions, and compute Bayesian regression with Gaussian scale mixture prior distributions using the randomly compressed response vector and feature matrix. Our proposed approach yields computational complexity growing in the cubic order of m. Another important motivation for this compression procedure is that it anonymizes the data by revealing little information about the original data in the course of analysis. Our detailed empirical investigation with the Horseshoe prior, a member of the class of Gaussian scale mixture priors, shows closely similar inference and a massive reduction in per-iteration computation time for the proposed approach compared to regression with the full sample. One notable contribution of this article is the derivation of the posterior contraction rate for high dimensional predictor coefficients under a general class of shrinkage priors and data compression/sketching. In particular, we characterize the dimension m of the compressed response vector as a function of the sample size, the number of predictors and the sparsity in the regression, to guarantee asymptotically accurate estimation of the predictor coefficients even after data compression.
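The compression step described in the abstract can be illustrated with a minimal sketch. Several choices here are assumptions, not details taken from the abstract: the sketching matrix has i.i.d. N(0, 1/n) entries, the helper names `sketch_data` and `gaussian_posterior_mean` are hypothetical, and a fixed Gaussian prior stands in for the Horseshoe. (A Gaussian scale mixture prior is Gaussian conditional on its scales, so `prior_var` plays the role those scales would play within one sampler iteration.) The posterior mean is written in a form whose only linear solve is m x m, which is where a cost cubic in m comes from.

```python
import numpy as np

def sketch_data(y, X, m, rng):
    """Compress n samples to m rows with a random linear map.

    Assumption: Phi has i.i.d. N(0, 1/n) entries; other sketching
    matrices are possible.
    """
    n = X.shape[0]
    Phi = rng.normal(0.0, 1.0 / np.sqrt(n), size=(m, n))
    return Phi @ y, Phi @ X

def gaussian_posterior_mean(y, X, prior_var, noise_var=1.0):
    """Posterior mean of beta under beta_j ~ N(0, prior_var[j]).

    Uses the identity D X^T (X D X^T + s2 I)^{-1} y, so the only
    linear solve involves a matrix with as many rows as X.
    """
    XD = X * prior_var                        # X @ diag(prior_var)
    K = XD @ X.T + noise_var * np.eye(X.shape[0])
    return XD.T @ np.linalg.solve(K, y)

# Synthetic sparse regression (illustrative sizes only).
rng = np.random.default_rng(0)
n, p, m = 2000, 500, 200
beta_true = np.zeros(p)
beta_true[:5] = 3.0
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(size=n)

# Sketch the data, then fit on the m compressed rows only.
y_s, X_s = sketch_data(y, X, m, rng)
prior_var = np.ones(p)                        # stand-in for prior scales
beta_hat = gaussian_posterior_mean(y_s, X_s, prior_var)
```

Only `y_s` and `X_s` enter the fit, so the original n rows are not needed once the sketch has been formed.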


