Simultaneous Inference for Massive Data: Distributed Bootstrap

02/19/2020
by Yang Yu, et al.

In this paper, we propose a bootstrap method for massive data processed in a distributed fashion across a large number of machines. The new method is computationally efficient: we bootstrap on the master machine without the over-resampling typically required by existing methods <cit.>, while provably achieving optimal statistical efficiency with minimal communication. Our method does not require repeatedly re-fitting the model; instead, it applies a multiplier bootstrap on the master machine to the gradients received from the worker machines. Simulations validate our theory.
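To illustrate the idea described above, here is a minimal sketch of a multiplier bootstrap applied at the master to per-worker gradients. All names (`multiplier_bootstrap_quantile`, `worker_grads`, `hessian_inv`) are illustrative assumptions, not the paper's actual implementation: it assumes each worker has already sent the master its local gradient evaluated at the current estimator, and that the master holds an approximate inverse Hessian.

```python
import numpy as np

def multiplier_bootstrap_quantile(worker_grads, hessian_inv, B=500,
                                  alpha=0.05, seed=None):
    """Sketch (not the paper's code): estimate the (1 - alpha) quantile of a
    sup-norm statistic by perturbing per-worker gradients with Gaussian
    multipliers, with no model re-fitting and no re-sampling of raw data."""
    rng = np.random.default_rng(seed)
    G = np.asarray(worker_grads)          # shape (k, d): one gradient per worker
    k, _ = G.shape
    Gc = G - G.mean(axis=0)               # center gradients across workers
    stats = np.empty(B)
    for b in range(B):
        eps = rng.standard_normal(k)      # i.i.d. N(0, 1) multipliers
        g_star = eps @ Gc / k             # perturbed average gradient
        # Map through the (approximate) inverse Hessian and take the sup-norm,
        # mimicking a simultaneous (max-type) inference statistic.
        stats[b] = np.max(np.abs(hessian_inv @ g_star))
    return np.quantile(stats, 1.0 - alpha)

# Hypothetical usage: 10 workers, 3-dimensional parameter.
rng = np.random.default_rng(0)
grads = rng.standard_normal((10, 3))
half_width = multiplier_bootstrap_quantile(grads, np.eye(3), B=200, seed=1)
```

The key computational point, consistent with the abstract, is that each bootstrap draw costs only `O(k * d)` arithmetic at the master, since resampling raw data or re-solving the optimization on the workers is replaced by reweighting already-communicated gradients.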

