Scalable Resampling in Massive Generalized Linear Models via Subsampled Residual Bootstrap

07/13/2023
by Indrila Ganguly, et al.

Residual bootstrap is a classical method for statistical inference in regression settings. With massive data sets becoming increasingly common, there is a demand for computationally efficient alternatives to residual bootstrap. We propose a simple and versatile scalable algorithm called subsampled residual bootstrap (SRB) for generalized linear models (GLMs), a large class of regression models that includes the classical linear regression model as well as other widely used models such as logistic, Poisson and probit regression. We prove consistency and distributional results that establish that the SRB has the same theoretical guarantees under the GLM framework as the classical residual bootstrap, while being computationally much faster. We demonstrate the empirical performance of SRB via simulation studies and a real data analysis of the Forest Covertype data from the UCI Machine Learning Repository.
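
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the classical residual bootstrap for ordinary least squares. It is not the SRB algorithm proposed in the paper, whose details are not given in the abstract. The function name `residual_bootstrap_ols`, the simulated data, and the choice of 200 replicates are assumptions made purely for illustration.

```python
import numpy as np


def residual_bootstrap_ols(X, y, n_boot=200, seed=0):
    """Classical residual bootstrap for OLS coefficients (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Fit OLS once on the full data.
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta_hat
    # Centre the residuals so the resampled errors have mean zero.
    resid = y - fitted
    resid = resid - resid.mean()
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        # Resample residuals with replacement; the design matrix stays fixed.
        e_star = rng.choice(resid, size=n, replace=True)
        y_star = fitted + e_star
        # Refit on the full-size synthetic response (the costly step for large n).
        boot[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]
    return beta_hat, boot


# Hypothetical simulated data: intercept 1.0, slope 2.0, standard normal noise.
data_rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), data_rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + data_rng.normal(size=n)

beta_hat, boot = residual_bootstrap_ols(X, y)
se = boot.std(axis=0, ddof=1)  # bootstrap standard errors of the coefficients
```

Each replicate refits the model on all n observations, which is the cost that motivates scalable alternatives when n is massive.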


