A Distributed and Integrated Method of Moments for High-Dimensional Correlated Data Analysis

10/07/2019
by Emily C. Hector, et al.

This paper is motivated by a regression analysis of electroencephalography (EEG) neuroimaging data whose high-dimensional responses exhibit multi-level nested correlations. We develop a divide-and-conquer procedure, implemented in a fully distributed and parallelized computational scheme, for statistical estimation and inference of regression parameters. Despite significant efforts in the literature, the computational bottleneck associated with high-dimensional likelihoods prevents existing methods from scaling. The proposed method addresses this challenge by dividing the responses into subvectors that are analyzed separately and in parallel on a distributed platform using pairwise composite likelihood. Theoretical challenges related to combining results from dependent data are overcome in a statistically efficient way using a meta-estimator derived from Hansen's generalized method of moments. We provide a rigorous theoretical framework for efficient estimation, inference, and goodness-of-fit tests. We develop an R package for ease of implementation. We illustrate our method's performance with simulations and the analysis of the EEG data, and find that iron deficiency is significantly associated with two auditory recognition memory-related potentials in the left parietal-occipital region of the brain.
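The divide-and-combine idea can be illustrated with a minimal Python sketch. This is not the authors' R package or their exact estimator. It assumes a linear model, replaces pairwise composite likelihood with simple least-squares estimating equations within each block, and combines the blocks with a closed-form generalized method of moments (GMM) step. The function names (`simulate`, `block_fit`, `gmm_combine`), the simulated data, and the block layout are all hypothetical.

```python
import numpy as np

def simulate(N=400, M=12, seed=0):
    """Simulate N subjects, each with M correlated responses (hypothetical data)."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5])
    X = rng.normal(size=(N, M, beta.size))
    # Exchangeable correlation across the M responses of each subject.
    rho = 0.5
    cov = rho * np.ones((M, M)) + (1 - rho) * np.eye(M)
    eps = rng.normal(size=(N, M)) @ np.linalg.cholesky(cov).T
    y = np.einsum("nmp,p->nm", X, beta) + eps
    return X, y, beta

def block_fit(Xb, yb):
    """Fit one response block; return moment pieces and per-subject scores.

    Stand-in for the block-level analysis: least-squares estimating
    equations, not pairwise composite likelihood.
    """
    N = Xb.shape[0]
    A = np.einsum("nmp,nmq->pq", Xb, Xb) / N   # (p, p) average X'X
    b = np.einsum("nmp,nm->p", Xb, yb) / N     # (p,)   average X'y
    beta_hat = np.linalg.solve(A, b)           # block-specific estimate
    resid = yb - np.einsum("nmp,p->nm", Xb, beta_hat)
    psi = np.einsum("nmp,nm->np", Xb, resid)   # (N, p) subject-level scores
    return A, b, psi

def gmm_combine(X, y, blocks):
    """Combine block analyses by minimizing (b - A beta)' V^{-1} (b - A beta)."""
    # Each block fit is independent of the others and could run in parallel.
    fits = [block_fit(X[:, idx, :], y[:, idx]) for idx in blocks]
    A = np.vstack([f[0] for f in fits])        # (J*p, p)
    b = np.concatenate([f[1] for f in fits])   # (J*p,)
    psi = np.hstack([f[2] for f in fits])      # (N, J*p)
    # Sample covariance of the stacked scores; its off-diagonal blocks
    # capture the dependence between response blocks.
    V = psi.T @ psi / psi.shape[0]
    VinvA = np.linalg.solve(V, A)
    Vinvb = np.linalg.solve(V, b)
    return np.linalg.solve(A.T @ VinvA, A.T @ Vinvb)

if __name__ == "__main__":
    X, y, beta_true = simulate()
    blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
    print("true beta:    ", beta_true)
    print("combined beta:", gmm_combine(X, y, blocks))
```

Because the moment conditions in this toy model are linear in the regression parameter, the GMM minimizer has a closed form. With a nonlinear model, the combining step would generally need an iterative or one-step update.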

