Variance estimation in pseudo-expected estimating equations for missing data
Missing data is a common challenge in biomedical research. Together with the growing size of modern datasets, this makes computationally efficient analysis with missing data of crucial practical importance. A general, computationally efficient estimation framework for dealing with missing data is the pseudo-expected estimating equations (PEEE) approach. The method is applicable to any parametric model for which estimation involves solving a set of estimating equations, such as likelihood score equations. A key limitation of the PEEE approach is that there is currently no closed-form variance estimator, so variance estimation requires the computationally burdensome bootstrap method. In this work, we address this gap and provide a closed-form variance estimator whose computation can be significantly faster than a bootstrap approach. Our variance estimator is shown to be consistent even with auxiliary variables and under misspecified models for the incomplete variables. Simulation studies show that our variance estimator performs well and that its computation can be over 50 times faster than the bootstrap. This gain in computational efficiency is crucial with large datasets or when the main analysis method is computationally intensive. Finally, the PEEE approach, along with our variance estimator, is used to analyze incomplete electronic health record data of patients with traumatic brain injury.