A Fast Bootstrap Algorithm for Causal Inference with Large Data

02/06/2023
by   Matthew Kosko, et al.

Estimating causal effects from large experimental and observational data has become increasingly prevalent in both industry and research. The bootstrap is an intuitive and powerful technique for constructing standard errors and confidence intervals of estimators. Its application, however, can be prohibitively demanding in settings involving large data. In addition, modern causal inference estimators based on machine learning and optimization techniques exacerbate the computational burden of the bootstrap. The bag of little bootstraps has been proposed in non-causal settings for large data but has not yet been applied to evaluate the properties of estimators of causal effects. In this paper, we introduce a new bootstrap algorithm, called the causal bag of little bootstraps, for causal inference with large data. The new algorithm significantly improves the computational efficiency of the traditional bootstrap while providing consistent estimates and desirable confidence interval coverage. We describe its properties, provide practical considerations, and evaluate the performance of the proposed algorithm in terms of bias and coverage of 95% confidence intervals in a simulation study. We apply it to evaluate the effect of hormone therapy on the average time to coronary heart disease using a large observational data set from the Women's Health Initiative.
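
To make the idea concrete, the following is a minimal sketch of the generic bag of little bootstraps applied to a treatment effect, not the causal algorithm proposed in the paper, whose details the abstract does not give. It assumes a randomized binary treatment and a simple weighted difference-in-means estimator. The function names, the subset-size exponent `gamma`, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def weighted_diff_in_means(y, t, w):
    """Weighted difference in mean outcomes, treated minus control (assumed estimator)."""
    treated = w * t
    control = w * (1 - t)
    return np.sum(treated * y) / np.sum(treated) - np.sum(control * y) / np.sum(control)


def blb_confidence_interval(y, t, n_subsets=5, gamma=0.7, n_resamples=100,
                            alpha=0.05, seed=0):
    """Generic bag-of-little-bootstraps percentile interval (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    b = int(n ** gamma)  # size of each small subset
    if n_subsets * b > n:
        raise ValueError("n_subsets * b must not exceed the sample size")
    perm = rng.permutation(n)
    lowers, uppers = [], []
    for j in range(n_subsets):
        idx = perm[j * b:(j + 1) * b]  # disjoint subset of b observations
        y_sub, t_sub = y[idx], t[idx]
        # Each resample has nominal size n but is stored as multinomial counts
        # over only b distinct points, so the estimator touches b rows, not n.
        counts = rng.multinomial(n, np.full(b, 1.0 / b), size=n_resamples)
        estimates = np.array(
            [weighted_diff_in_means(y_sub, t_sub, c) for c in counts]
        )
        lo, hi = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
        lowers.append(lo)
        uppers.append(hi)
    # Average the interval endpoints across subsets.
    return float(np.mean(lowers)), float(np.mean(uppers))


# Simulated randomized experiment with a true effect of 2.0 (assumed data).
rng = np.random.default_rng(42)
n = 20_000
t = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * t + rng.normal(size=n)
lower, upper = blb_confidence_interval(y, t)
print(f"BLB 95% interval for the effect: ({lower:.3f}, {upper:.3f})")
```

The saving comes from evaluating the estimator on `b` weighted observations per resample instead of `n`, which matters most when the estimator itself is expensive, as with the machine learning based estimators the abstract mentions.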

