
Divide and Recombine for Large and Complex Data: Model Likelihood Functions using MCMC

by Qi Liu et al.

In Divide & Recombine (D&R), big data are divided into subsets, an analytic method is applied to each subset, and the outputs are recombined. This enables deep analysis and practical computational performance. An innovative D&R procedure is proposed to compute likelihood functions of data-model (DM) parameters for big data. The likelihood model (LM) is a parametric probability density function of the DM parameters. The density parameters are estimated by fitting the density to MCMC draws from each subset DM likelihood function, and then the fitted densities are recombined. The procedure is illustrated using normal and skew-normal LMs for the logistic regression DM.
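The abstract's procedure with a normal LM can be sketched in a few steps: divide the data into subsets, draw from each subset's logistic-regression likelihood by MCMC, fit a normal density to each subset's draws, and recombine the fitted normals by the usual precision-weighted product. The sketch below is a minimal illustration under simplifying assumptions (flat priors, a plain random-walk Metropolis sampler, simulated data); it is not the authors' implementation, and all names and settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated logistic-regression data (hypothetical example) ---
n, d = 4000, 2
X = rng.normal(size=(n, d))
beta_true = np.array([0.8, -0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

def loglik(beta, Xs, ys):
    """Logistic-regression log-likelihood for one subset."""
    z = Xs @ beta
    return np.sum(ys * z - np.log1p(np.exp(z)))

def metropolis(Xs, ys, n_iter=3000, step=0.1):
    """Random-walk Metropolis draws from one subset likelihood (flat prior)."""
    beta = np.zeros(d)
    ll = loglik(beta, Xs, ys)
    draws = []
    for _ in range(n_iter):
        prop = beta + step * rng.normal(size=d)
        ll_prop = loglik(prop, Xs, ys)
        if np.log(rng.uniform()) < ll_prop - ll:
            beta, ll = prop, ll_prop
        draws.append(beta.copy())
    return np.array(draws[n_iter // 2:])  # discard burn-in

# --- Divide: model each subset likelihood with a fitted normal LM ---
S = 4
means, precisions = [], []
for Xs, ys in zip(np.array_split(X, S), np.array_split(y, S)):
    draws = metropolis(Xs, ys)
    means.append(draws.mean(axis=0))               # normal-LM mean
    precisions.append(np.linalg.inv(np.cov(draws.T)))  # normal-LM precision

# --- Recombine: a product of normal densities is again normal ---
Lam_all = sum(precisions)
mu_all = np.linalg.solve(
    Lam_all, sum(L @ m for L, m in zip(precisions, means))
)
print(mu_all)  # recombined estimate; should lie near beta_true
```

Because each subset sampler targets the subset likelihood under a flat prior, the product of the fitted normal densities approximates the full-data likelihood; a skew-normal LM would replace the mean/precision fit with a skew-normal fit and a corresponding recombination rule.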



