
Exact Subsampling MCMC
Speeding up Markov Chain Monte Carlo (MCMC) for data sets with many observations by data subsampling has recently received considerable attention in the literature. Most of the proposed methods are approximate, and the only exact solution has been documented to be highly inefficient. We propose a simulation-consistent subsampling method for estimating expectations of any function of the parameters, combining MCMC with data subsampling and the importance sampling correction for occasionally negative likelihood estimates in Lyne et al. (2015). Our algorithm first obtains an unbiased but not necessarily positive estimate of the likelihood. The estimator uses a soft lower bound, so that the likelihood estimate is positive with high probability, and computationally cheap control variates to reduce variability. Second, we carry out a correlated pseudo-marginal MCMC on the absolute value of the likelihood estimate. Third, the sign of the likelihood is corrected using an importance sampling step that has low variance by construction. We illustrate the usefulness of the method with two examples.
03/27/2016 ∙ by Matias Quiroz, et al.
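The sign-correction step in the abstract above can be illustrated with a minimal sketch: the chain targets the posterior with the likelihood replaced by its absolute value, and expectations are recovered by weighting each draw by the sign of its likelihood estimate. All names here are illustrative, not from the paper's code:

```python
import numpy as np

def sign_corrected_estimate(f_values, signs):
    """Importance-sampling sign correction in the style of Lyne et al. (2015):
    given draws theta_k from a chain targeting |L_hat(theta)| p(theta), with
    s_k the sign of the likelihood estimate at theta_k, E_pi[f(theta)] is
    estimated by sum_k f(theta_k) s_k / sum_k s_k."""
    f_values = np.asarray(f_values, dtype=float)
    signs = np.asarray(signs, dtype=float)
    return np.sum(f_values * signs) / np.sum(signs)
```

When the estimate is positive with high probability, as the soft lower bound is designed to ensure, almost all signs are +1, so the denominator stays close to the number of draws and the correction has low variance; with all signs positive it reduces to the ordinary MCMC average.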

Speeding Up MCMC by Efficient Data Subsampling
We propose Subsampling MCMC, a Markov Chain Monte Carlo (MCMC) framework in which the likelihood function for n observations is estimated from a random subset of m observations. We introduce a highly efficient unbiased estimator of the log-likelihood based on control variates, whose computing cost is much smaller than that of the full log-likelihood in standard MCMC. The likelihood estimate is bias-corrected and used in two dependent pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to n and m, respectively. Our analysis allows the number of variables p to grow with n. We propose a practical estimator of the error and show that the error is negligible even for very small m in our applications. We demonstrate that Subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature.
04/16/2014 ∙ by Matias Quiroz, et al.
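A toy version of such a control-variate ("difference") estimator of the log-likelihood can be sketched for i.i.d. Gaussian observations with a quadratic Taylor control variate around a fixed reference point. The model and all variable names are illustrative assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
y = rng.normal(0.3, 1.0, size=n)        # toy data: y_i ~ N(theta, 1)

def loglik_i(theta, yi):
    # per-observation log-density
    return -0.5 * np.log(2 * np.pi) - 0.5 * (yi - theta) ** 2

def q_i(theta, yi, theta_star):
    # cheap control variate: second-order Taylor expansion of loglik_i
    # in theta around a fixed reference point theta_star
    d = theta - theta_star
    return loglik_i(theta_star, yi) + (yi - theta_star) * d - 0.5 * d ** 2

def est_loglik(theta, theta_star, m):
    # difference estimator: sum_i q_i + (n/m) * sum_{i in S} (loglik_i - q_i),
    # with the subsample S drawn uniformly with replacement
    q_all = q_i(theta, y, theta_star)
    idx = rng.integers(0, n, size=m)
    return q_all.sum() + (n / m) * np.sum(loglik_i(theta, y[idx]) - q_all[idx])
```

Because the Gaussian log-density is exactly quadratic in theta, the control variate is exact in this toy and the estimator has zero variance; in general, the closer q_i is to the true per-observation log-likelihood, the smaller the variance of the estimator.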

Hamiltonian Monte Carlo with Energy Conserving Subsampling
Hamiltonian Monte Carlo (HMC) has recently received considerable attention in the literature due to its ability to overcome the slow exploration of the parameter space inherent in random walk proposals. In tandem, data subsampling has been extensively used to overcome the computational bottlenecks in posterior sampling algorithms that require evaluating the likelihood over the whole data set, or its gradient. However, while data subsampling has been successful in traditional MCMC algorithms such as Metropolis-Hastings, it has been demonstrated to be unsuccessful in the context of HMC, both in terms of poor sampling efficiency and in producing highly biased inferences. We propose an efficient HMC-within-Gibbs algorithm that utilizes data subsampling to speed up computations and simulates from a slightly perturbed target, which is within O(m^-2) of the true target, where m is the size of the subsample. We also show how to modify the method to obtain exact inference on any function of the parameters. Contrary to previous unsuccessful approaches, we perform subsampling in a way that conserves energy, but for a modified Hamiltonian. We can therefore maintain high acceptance rates even for distant proposals. We apply the method to simulate from the posterior distribution of a high-dimensional spline model for bankruptcy data, documenting speed-ups of several orders of magnitude compared to standard HMC and, moreover, a negligible bias.
08/02/2017 ∙ by Khue-Dung Dang, et al.
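The energy-conserving idea can be sketched in a few lines: the subsample is held fixed for the whole leapfrog trajectory, so the trajectory conserves the estimated (modified) Hamiltonian and acceptance rates stay high even when the current state is far from the subsample's mode. The toy below uses a naive uniform refresh of the subsample and no control variates, so it targets a noticeably perturbed posterior; the paper instead updates the subsample indices with a block pseudo-marginal step and uses control variates to keep the perturbation small. All settings are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5000, 100
y = rng.normal(1.0, 1.0, size=n)     # toy data: y_i ~ N(theta, 1), flat prior

def U(theta, idx):
    # subsampled estimate of the negative log-posterior; held fixed during
    # a trajectory, so the leapfrog conserves this *modified* energy
    return (n / m) * 0.5 * np.sum((y[idx] - theta) ** 2)

def grad_U(theta, idx):
    return (n / m) * np.sum(theta - y[idx])

def step(theta, eps=0.005, L=10):
    idx = rng.integers(0, n, size=m)    # Gibbs block: refresh the subsample
    p0 = rng.normal()
    th, p = theta, p0 - 0.5 * eps * grad_U(theta, idx)
    for i in range(L):                  # standard leapfrog with idx fixed
        th = th + eps * p
        g = grad_U(th, idx)
        p = p - eps * g if i < L - 1 else p - 0.5 * eps * g
    # accept/reject with the *same* estimated potential at both ends
    log_acc = U(theta, idx) - U(th, idx) + 0.5 * (p0 ** 2 - p ** 2)
    return th if np.log(rng.uniform()) < log_acc else theta

theta, draws = 1.0, []
for _ in range(2000):
    theta = step(theta)
    draws.append(theta)
```

Because the acceptance ratio compares start and end points under one fixed subsample estimate, near-exact energy conservation of the leapfrog keeps acceptance high; naively re-estimating the potential at the endpoint with a fresh subsample is what destroys acceptance rates in earlier approaches.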

Gaussian variational approximation for high-dimensional state space models
Our article considers variational approximations of the posterior distribution in a high-dimensional state space model. The variational approximation is a multivariate Gaussian density, in which the variational parameters to be optimized are a mean vector and a covariance matrix. The number of parameters in the covariance matrix grows as the square of the number of model parameters, so it is necessary to find simple yet effective parametrizations of the covariance structure when the number of model parameters is large. The joint posterior distribution over the high-dimensional state vectors is approximated using a dynamic factor model, with Markovian dependence in time and a factor covariance structure for the states. This gives a reduced-dimension description of the dependence structure for the states, as well as a temporal conditional independence structure similar to that in the true posterior. We illustrate our approach in two high-dimensional applications which are challenging for Markov chain Monte Carlo sampling. The first is a spatio-temporal model for the spread of the Eurasian Collared-Dove across North America. The second is a multivariate stochastic volatility model for financial returns via a Wishart process.
01/24/2018 ∙ by Matias Quiroz, et al.
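The factor idea for the covariance can be sketched in its simplest, static form: the covariance is restricted to a low-rank-plus-diagonal structure, and draws from the variational density are generated by the corresponding reparameterization. The dimensions and names below are illustrative; the paper's parametrization additionally exploits the temporal Markov structure of the states:

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 50, 3      # p latent dimensions, k factors (illustrative sizes)

mu = np.zeros(p)                          # variational mean
B = rng.normal(scale=0.1, size=(p, k))    # factor loadings
d = np.full(p, 0.5)                       # idiosyncratic standard deviations

def sample_q(n_draws):
    # reparameterized draws from N(mu, B B^T + diag(d^2)):
    # theta = mu + B z + d * eps with z ~ N(0, I_k), eps ~ N(0, I_p)
    z = rng.normal(size=(n_draws, k))
    eps = rng.normal(size=(n_draws, p))
    return mu + z @ B.T + d * eps

# the factor structure needs p*(k + 2) variational parameters instead of
# the p + p*(p + 1)/2 of an unrestricted mean and covariance
n_factor = p * (k + 2)
n_full = p + p * (p + 1) // 2
```

With p = 50 and k = 3 this is 250 parameters instead of 1325, and the gap grows quadratically with p, which is the point of the parametrization.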

Subsampling MCMC - A review for the survey statistician
The rapid development of computing power and efficient Markov Chain Monte Carlo (MCMC) simulation algorithms have revolutionized Bayesian statistics, making it a highly practical inference method in applied work. However, MCMC algorithms tend to be computationally demanding, and are particularly slow for large datasets. Data subsampling has recently been suggested as a way to make MCMC methods scalable to massive data, utilizing efficient sampling schemes and estimators from the survey sampling literature. These developments tend to be unknown to many survey statisticians, who traditionally work with non-Bayesian methods and rarely use MCMC. Our article reviews Subsampling MCMC, a so-called pseudo-marginal MCMC approach to speeding up MCMC through data subsampling. The review is written for a survey statistician without previous knowledge of MCMC methods, since our aim is to motivate survey sampling experts to contribute to the growing Subsampling MCMC literature.
07/23/2018 ∙ by Matias Quiroz, et al.
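One survey-sampling ingredient that carries over directly is probability-proportional-to-size (PPS) estimation of a population total, here the sum of per-observation log-likelihood contributions. A minimal sketch with illustrative names, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def pps_estimate(values, probs, m, rng):
    """Hansen-Hurwitz-style estimator from survey sampling: draw m indices
    with replacement, index i with probability probs[i]; then
    mean(values[idx] / probs[idx]) is unbiased for values.sum()."""
    idx = rng.choice(len(values), size=m, p=probs, replace=True)
    return np.mean(values[idx] / probs[idx])

# when probs are exactly proportional to (positive) values, every draw
# returns values.sum() and the estimator has zero variance
v = np.array([1.0, 2.0, 3.0, 4.0])
est = pps_estimate(v, v / v.sum(), 25, rng)
```

The zero-variance limiting case motivates the role of control variates in Subsampling MCMC: the better the cheap size measures track the actual log-likelihood contributions, the smaller the variance of the subsampled estimator.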


On some variance reduction properties of the reparameterization trick
The so-called reparameterization trick is widely used in variational inference as it yields more accurate estimates of the gradient of the variational objective than alternative approaches such as the score function method. The resulting optimization converges much faster, as the variance reduction offered by the reparameterization gradient is typically several orders of magnitude. There is overwhelming empirical evidence in the literature showing its success. However, there is relatively little research that explores why the reparameterization gradient is so effective. We explore this under two main simplifying assumptions. First, we assume that the variational approximation is the commonly used mean-field Gaussian density. Second, we assume that the log of the joint density of the model parameter vector and the data is a quadratic function that depends on the variational mean. These assumptions allow us to obtain tractable expressions for the marginal variances of the score function and reparameterization gradient estimators. We also derive lower bounds for the score function marginal variances through Rao-Blackwellization and prove that, under our assumptions, they are larger than those of the reparameterization trick. Finally, we apply the results of our idealized analysis to examples where the log-joint density is not quadratic, such as a multinomial logistic regression and a Bayesian neural network with two layers.
09/27/2018 ∙ by Ming Xu, et al.
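The comparison can be reproduced in a few lines for the simplest case the abstract alludes to: a Gaussian variational density and a quadratic log-joint (here a standard normal target). Both estimators of the gradient of the variational objective with respect to the mean are unbiased, but the reparameterization version has far smaller variance. This is a toy illustration, not the paper's derivation:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, S = 0.5, 1.0, 100_000

z = rng.normal(size=S)
theta = mu + sigma * z          # reparameterized draws from q = N(mu, sigma^2)

log_p = -0.5 * theta ** 2       # quadratic log-joint: standard normal target
dlog_p = -theta                 # its derivative in theta

# two unbiased estimators of d/dmu E_q[log p(theta)] = -mu
grad_reparam = dlog_p                              # reparameterization trick
grad_score = log_p * (theta - mu) / sigma ** 2     # score function (REINFORCE)
```

Here grad_reparam has variance 1, while the score-function estimator involves higher moments of theta and has several times larger variance even in this mild one-dimensional example; the gap is what drives the faster convergence described above.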

Subsampling Sequential Monte Carlo for Static Bayesian Models
Our article shows how to carry out Bayesian inference by combining data subsampling with Sequential Monte Carlo (SMC). This combines the attractive properties of SMC for Bayesian computation with the ability of subsampling to tackle big data problems. SMC sequentially updates a cloud of particles through a sequence of densities, beginning with a density that is easy to sample from, such as the prior, and ending with the posterior density. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large, and it is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full-data SMC, which is a great advantage for parallel computation. Second, we use a Metropolis-within-Gibbs kernel with two conditional updates: a Hamiltonian Monte Carlo update that makes distant moves for the model parameters, and a block pseudo-marginal proposal for the particles corresponding to the auxiliary variables of the data subsampling. We demonstrate the usefulness of the methodology using two large datasets.
05/08/2018 ∙ by David Gunawan, et al.
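The reweight/resample/move recursion described above, without the subsampling ingredients, can be sketched for a toy one-parameter model with a likelihood-tempered sequence of densities. All settings are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(2.0, 1.0, size=500)          # toy data: y_i ~ N(theta, 1)
prior_sd = 3.0                              # prior: theta ~ N(0, prior_sd^2)

def loglik(theta):                          # vectorized over particles
    return -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)

N = 1000
particles = rng.normal(0.0, prior_sd, size=N)   # start from the prior
logw = np.zeros(N)
temps = np.linspace(0.0, 1.0, 21)               # tempering schedule

for t0, t1 in zip(temps[:-1], temps[1:]):
    # 1) reweight: incremental weight is the tempered likelihood ratio
    logw += (t1 - t0) * loglik(particles)
    w = np.exp(logw - logw.max()); w /= w.sum()
    # 2) resample when the effective sample size drops below N/2
    if 1.0 / np.sum(w ** 2) < N / 2:
        particles = particles[rng.choice(N, size=N, p=w)]
        logw = np.zeros(N)
    # 3) move: one random-walk MH step targeting the tempered posterior
    prop = particles + 0.1 * rng.normal(size=N)
    log_acc = t1 * (loglik(prop) - loglik(particles)) \
        + (particles ** 2 - prop ** 2) / (2 * prior_sd ** 2)
    accept = np.log(rng.uniform(size=N)) < log_acc
    particles = np.where(accept, prop, particles)

w = np.exp(logw - logw.max()); w /= w.sum()
post_mean = float(np.sum(w * particles))
```

The loglik calls inside the move step are where the full-data cost sits; the paper's contribution is to replace them with a subsampled annealed likelihood estimator and to update the subsampling auxiliary variables in a second conditional (block pseudo-marginal) step.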

Variance reduction properties of the reparameterization trick
The reparameterization trick is widely used in variational inference as it yields more accurate estimates of the gradient of the variational objective than alternative approaches such as the score function method. Although there is overwhelming empirical evidence in the literature showing its success, there is relatively little research exploring why the reparameterization trick is so effective. We explore this under the idealized assumptions that the variational approximation is a mean-field Gaussian density and that the log of the joint density of the model parameters and the data is a quadratic function that depends on the variational mean. From this, we show that the marginal variances of the reparameterization gradient estimator are smaller than those of the score function gradient estimator. We apply the result of our idealized analysis to real-world examples.
09/27/2018 ∙ by Ming Xu, et al.