The covariate-adjusted residual estimator and its use in both randomized trials and observational settings

10/24/2019
by   Stephen A. Lauer, et al.

We often seek to estimate the causal effect of an exposure on a particular outcome in both randomized and observational settings. One such estimation method is the covariate-adjusted residuals estimator, which was designed for individually or cluster randomized trials. In this manuscript, we study the properties of this estimator and develop a new estimator that utilizes both covariate adjustment and inverse probability weighting. We support our theoretical results with a simulation study and an application in an infectious disease setting. The covariate-adjusted residuals estimator is an efficient and unbiased estimator of the average treatment effect in randomized trials; however, it is not guaranteed to be unbiased in observational studies. Our novel estimator, the covariate-adjusted residuals estimator with inverse probability weighting, is unbiased in randomized and observational settings, under a reasonable set of assumptions. Furthermore, when these assumptions hold, it provides efficiency gains over inverse probability weighting in observational studies. The covariate-adjusted residuals estimator is valid for use in randomized trials, but should not be used in observational studies. The covariate-adjusted residuals estimator with inverse probability weighting provides an efficient alternative for use in randomized and observational settings.


1 Causal Roadmap

Did bednets reduce childhood mortality in Ghana? This is a common structure for a causal scientific question: how would a change in an exposure (e.g. bednets) change an outcome (e.g. childhood mortality)? Answering causal questions requires a different approach than answering descriptive or associative questions. For example, a descriptive analysis may provide point and uncertainty estimates for the childhood mortality in clusters that actually received bednets and in clusters that did not receive bednets. If we were interested in predicting childhood mortality, we would want to know whether including a covariate for bednets added value to predictions that may use other information (such as age and sex), regardless of whether that relationship was causal or associative. Answering the causal question requires a deeper understanding of the system that generates the exposure and the outcome, as well as the influence of additional covariates.

Several conceptual and analytic frameworks exist to guide the answering of causally motivated questions.[Neyman1923, Rubin1974, Robins1986, Rubin1990, Holland1986, Spirtes1993, Pearl2009, Robins2009, Richardson2013swig, Petersen2014, balzer_tutorial_2016, Hernan2016] Here, we review the Causal Roadmap of [Petersen2014] and use the bednet example for illustration. The key steps of the Causal Roadmap are representing knowledge of the data generating process (represented by a causal model); specifying the quantity that answers the scientific question (the causal parameter); evaluating the assumptions required to link the causal quantity to a well-defined function of the observed data distribution (the statistical parameter, which may or may not be identifiable); and finally obtaining a point estimate and inference for the statistical parameter.

Figure 1: Causal diagrams for randomized trials (a) and observational studies (b). These diagrams give a visual representation of the relationships between the variables in a causal model. Arrows are drawn from a potential cause to an effect. In a completely randomized trial setting, the exposure $A$ is generated independently of all other variables and the outcome $Y$ may be influenced by both the exposure $A$ and a set of baseline covariates $W^Y$. In an observational setting, the exposure is no longer randomized, but instead is influenced by baseline covariates. Some of these covariates, $W^C$, also influence $Y$, thus confounding the relationship between the exposure $A$ and the outcome $Y$. Other covariates, $W^A$, only influence the exposure $A$ and not the outcome $Y$; as before, some covariates, $W^Y$, only influence the outcome $Y$ and not the exposure $A$. (For simplicity, other unmeasured sources of variation are omitted; see the Supplemental Material for a complete graph.)

A causal model is a structural framework for expressing the relationships between variables in a given setting.[Pearl1988, Pearl2009, Pearl2010, Goldberger1972, Duncan1975] A causal model can be expressed graphically as a diagram, where variables are connected by edges (arrows) that originate at a potential cause and terminate at the effect. Figure 1a is a diagram representing a randomized trial, like that of our case study, where $A$ is a binary exposure ($A = 1$ if the cluster received the bednet intervention, $A = 0$ if the cluster did not) and $W^Y$ is the set of baseline covariates (the average age and percentage of children who are female for each cluster) that may influence the outcome $Y$ (childhood mortality). There are no edges pointing to the exposure $A$ because the randomization procedure makes the allocation of exposure independent of all other covariates. (A diagram including unmeasured variables is depicted in the Supplementary Materials.) For this experimental setting, we assume that this causal model describes the data generating process for each cluster and that clusters are causally independent (i.e. the outcome for one cluster is only influenced by that cluster's exposure and baseline covariates and is independent of other clusters' exposures, baseline covariates, and outcomes).

In an observational setting (as portrayed in Figure 1b), the allocation to the exposure $A$ is not randomized and is potentially influenced by the baseline covariates. In addition to the covariates $W^Y$ that influence the outcome $Y$, but not the exposure $A$, there are two new subsets of covariates. One subset of covariates, $W^A$, only influences the exposure $A$, but not the outcome $Y$. The other subset, $W^C$, consists of confounding covariates that influence both the exposure $A$ and the outcome $Y$ and thus obscure the isolation of the causal effect of interest. As a running example, suppose bednets are distributed to clusters by the determination of local health officials instead of at random. In this scenario, consider a new baseline covariate: mosquito abundance. Places with greater mosquito abundance prior to the intervention may be at higher risk for infectious diseases and future childhood mortality, and public health officials would want to concentrate their efforts in these areas. Thus mosquito abundance is a confounding covariate, as it is a common cause of both the outcome and the exposure.

With the causal model specified, we can translate our scientific question into a causal parameter. We assume that the relationships within the causal model are autonomous, meaning that changing one relationship does not change the other relationships, although changes may result in different effects downstream.[Pearl2009, Pearl2010] Therefore, we could modify the way in which the exposure is generated and see resulting changes in the outcome. Specifically, we could intervene to give impregnated bednets to a cluster (i.e. set $A = 1$) and generate the counterfactual (potential) outcome $Y(1)$ for that cluster. Likewise, we could intervene to put the same cluster in the unexposed group (i.e. set $A = 0$) and generate the counterfactual (potential) outcome $Y(0)$ for that cluster. With these counterfactual outcomes, we translate this scientific question into a well-defined causal quantity, specifically the average treatment effect (ATE):

\[ \text{ATE} = E\left[ Y(1) - Y(0) \right] \tag{1} \]

The ATE is the expected difference in the average childhood mortality rate if all of the clusters in our target population received impregnated bednets ($A = 1$) and if none of the clusters received impregnated bednets ($A = 0$). We cannot directly estimate this parameter because we only observe the outcomes corresponding to the actual exposures and not both counterfactual outcomes. Thus, we need to outline the conditions and assumptions necessary to identify the causal parameter as a statistical parameter of the observed data distribution.

In an observational setting, suppose our observed data consist of confounding covariates $W^C$, the exposure indicator $A$, and the outcome $Y$; we denote the observed data as $O = (W^C, A, Y)$ and assume we have $n$ independent, identically distributed (i.i.d.) copies from some distribution $P_0$. In this setting, we use the difference in conditional expectations between the exposed and unexposed, adjusted for and averaged across the measured confounding covariates, as the statistical parameter:

\[ \psi_{obs} = E\left[ E(Y \mid A = 1, W^C) - E(Y \mid A = 0, W^C) \right] \tag{2} \]

This statistical parameter is known as the "G-computation identifiability result", which identifies the ATE under two assumptions.[Robins1986] First, there must be no unmeasured confounding between the exposure and the outcome: $Y(a) \perp A \mid W^C$. Second, the 'positivity assumption', which states that each possible stratum of measured confounding covariates has a non-zero probability of being in each exposure group ($P(A = a \mid W^C = w) > 0$ for all possible $w$), must hold.[Petersen2010] In our example, the statistical parameter is the expected difference in the childhood mortality rate between clusters with and without bednets, adjusting for common causes.

In a randomized trial, the process for allocating units to the exposure is truly random (i.e. a coin flip). The assumptions of no unmeasured confounding and positivity are therefore satisfied by the study design, and we can identify the ATE with the target statistical parameter $\psi_{rct}$, the difference in the expectation of the outcome between exposure groups:

\[ \psi_{rct} = E(Y \mid A = 1) - E(Y \mid A = 0) \tag{3} \]

The statistical parameter $\psi_{rct}$ can be consistently estimated using the difference in the average outcome between the exposure groups, also known as the 'unadjusted estimator':[Neyman1923]

\[ \hat{\psi}_{unadj} = \hat{E}(Y \mid A = 1) - \hat{E}(Y \mid A = 0) = \frac{1}{n_1} \sum_{i=1}^{n} I(A_i = 1)\, Y_i - \frac{1}{n_0} \sum_{i=1}^{n} I(A_i = 0)\, Y_i \tag{4} \]

where the expectations are taken under the empirical distribution $P_n$; $n_a$ is the number of units ($n_a = \sum_{i=1}^{n} I(A_i = a)$) in exposure level $a$, and $\hat{E}(Y \mid A = a)$ is an estimate of the expected outcome in exposure group $a$. In the bednet cluster randomized trial, the unadjusted estimate is the difference between the average mortality rate for clusters assigned to receive bednets and the average mortality rate for clusters not assigned to receive bednets, as illustrated in Figure 2a.

Figure 2: Estimates of the effect of bednets on childhood mortality rate using different methods; data obtained from [Hayes2009]. In all plots, the clusters are arranged by the average age of the children in the cluster (x-axis); blue triangles denote clusters assigned to receive bednets ($A = 1$; intervention), and orange circles denote clusters not assigned to receive bednets ($A = 0$; control). Despite randomization, there were imbalances between arms in baseline covariates predictive of the outcome. (a) The unadjusted estimate is the difference in the average child mortality rate ($Y$), per thousand follow-up years, between intervention clusters (dashed blue line) and control clusters (solid orange line). (b) The inverse probability weighting (IPW) estimator gives greater weight to intervention clusters with higher average ages and control clusters with lower average ages, as indicated by the size of the triangles and circles. The point estimate from IPW is the difference in the average of the weighted outcomes between groups. (c) The covariate-adjusted residuals estimator (CARE) uses predictions ($\hat{Y}$) of the child mortality rates based on Poisson regression with average age and proportion female as covariates. The point estimate from CARE is the difference in the average residuals between intervention clusters and control clusters. (d) The covariate-adjusted residuals estimator with inverse probability weighting (CARE–IPW) combines the weights from the IPW estimator with the residuals from CARE. Specifically, the point estimate from CARE–IPW is the difference in the average of the weighted residuals between groups.

In observational settings, the exposure allocation is no longer random, and we have to account for common causes $W^C$ of the exposure $A$ and outcome $Y$. If $W^C$ captures all those common causes (i.e. there is no unmeasured confounding) and there is sufficient variability in the exposure within possible strata of $W^C$ (i.e. positivity holds), then the inverse probability weighting (IPW) estimator can be used to estimate the statistical parameter $\psi_{obs}$:[Horvitz1952, Robins2000]

\[ \hat{\psi}_{ipw} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{I(A_i = 1)}{\hat{P}(A = 1 \mid W_i^C)} - \frac{I(A_i = 0)}{\hat{P}(A = 0 \mid W_i^C)} \right] Y_i \tag{5} \]

which controls for the confounding covariates through estimates of the conditional probability of receiving the exposure, called 'propensity scores', $\hat{P}(A = a \mid W^C)$.[Rosenbaum1983] Intuitively, the IPW estimator up-weights exposure-covariate combinations that are rare, relative to a randomized trial, and down-weights exposure-covariate combinations that are more common, relative to a randomized trial. If the propensity scores are consistent for the true conditional probability of exposure given common causes, $P(A = a \mid W^C)$, the IPW estimator is consistent for the statistical parameter $\psi_{obs}$. Notably, the propensity scores do not need to account for other covariates that only influence the exposure.
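The weighting in Equation 5 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `ipw_estimate` and the toy data are hypothetical, and the propensity score is estimated as the proportion exposed within each stratum of a single discrete confounder (a saturated model) instead of the logistic regression used in the paper.

```python
def ipw_estimate(a, y, w):
    """Horvitz-Thompson style IPW estimate (Equation 5).

    a: 0/1 exposures, y: outcomes, w: discrete confounder values.
    Assumption for this sketch: the propensity score is the empirical
    proportion exposed within each confounder stratum.
    """
    n = len(y)
    # Count units and exposed units within each confounder stratum.
    counts = {}
    for ai, wi in zip(a, w):
        total, exposed = counts.get(wi, (0, 0))
        counts[wi] = (total + 1, exposed + ai)

    weighted_sum = 0.0
    for ai, yi, wi in zip(a, y, w):
        total, exposed = counts[wi]
        p1 = exposed / total  # estimated P(A = 1 | W = wi)
        if not 0.0 < p1 < 1.0:
            raise ValueError("positivity violated in stratum %r" % (wi,))
        # Exposed units get weight 1/p1; unexposed units get -1/(1 - p1).
        weight = 1.0 / p1 if ai == 1 else -1.0 / (1.0 - p1)
        weighted_sum += weight * yi
    return weighted_sum / n


# Made-up data: exposure is more common in the stratum with higher outcomes.
w = [0, 0, 0, 0, 1, 1, 1, 1]
a = [1, 0, 0, 0, 1, 1, 1, 0]
y = [10.0, 12.0, 14.0, 16.0, 20.0, 22.0, 24.0, 30.0]

ipw = ipw_estimate(a, y, w)
unadjusted = (sum(yi for ai, yi in zip(a, y) if ai == 1) / a.count(1)
              - sum(yi for ai, yi in zip(a, y) if ai == 0) / a.count(0))
print(ipw, unadjusted)
```

In this toy data the unadjusted difference is positive while the weighted estimate is negative, because the exposed units are concentrated in the stratum with higher outcomes.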

IPW can also be used in randomized trials, and in that setting estimating the known propensity score can lead to efficiency gains over the unadjusted estimator.[vanderLaan2003, Shen2014, Balzer2016]

In the bednet cluster randomized example, we used a logistic regression with independent variables for average age and the proportion of children that are female to estimate the probability of bednet assignment for each cluster. Despite randomization, we found that the estimated propensity scores ranged from 0.22 to 0.72 and were higher for clusters with lower average age than for clusters with higher average age. In the study, younger clusters were, by chance, more likely to receive the bednet intervention than older clusters. When obtaining a point estimate by taking the average difference in weighted outcomes, intervention clusters with higher average ages were assigned greater weight, as were control clusters with lower average ages (Figure 2b).

2 The covariate-adjusted residuals estimator (CARE)

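The covariate-adjusted residuals estimator (CARE) was proposed as a method to estimate the ATE in randomized trials.[Gail1996, Bennett2002, Hayes2009] First, the outcome $Y$ is predicted using only the baseline covariates $W^Y$ and not the exposure $A$, giving us predicted values $\hat{Y}_i = \hat{E}(Y \mid W_i^Y)$. Next, residuals are derived as the difference between the observed outcome and the predicted one. Finally, the difference in the average residuals between exposure groups provides a point estimate:

\[ \hat{\psi}_{care} = \frac{1}{n_1} \sum_{i=1}^{n} I(A_i = 1) \left( Y_i - \hat{Y}_i \right) - \frac{1}{n_0} \sum_{i=1}^{n} I(A_i = 0) \left( Y_i - \hat{Y}_i \right) \tag{6} \]
\[ \hat{\psi}_{care} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{I(A_i = 1)}{\hat{P}(A = 1)} - \frac{I(A_i = 0)}{\hat{P}(A = 0)} \right] \left( Y_i - \hat{Y}_i \right) \tag{7} \]

where the number of units at each exposure level, $n_a$, is equal to the total number of units $n$ times the empirical probability of exposure, $\hat{P}(A = a)$. To obtain the predictions of the outcome in the absence of the exposure, $\hat{Y}_i$, [Hayes2009] recommend using Poisson regression for event rates, logistic regression for binary outcomes, and linear regression for continuous outcomes.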
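The three steps above (predict without the exposure, take residuals, compare mean residuals between arms) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names and the toy data are hypothetical, and the prediction step uses the mean outcome within each level of a single discrete covariate as a stand-in for the Poisson, logistic, or linear regressions recommended by [Hayes2009].

```python
def stratum_mean_predictions(y, w):
    """Predict each outcome by the mean outcome in its covariate stratum.

    The exposure is deliberately not used. This is a simple stand-in
    (an assumption of this sketch) for a regression of Y on covariates.
    """
    sums = {}
    for yi, wi in zip(y, w):
        total, count = sums.get(wi, (0.0, 0))
        sums[wi] = (total + yi, count + 1)
    return [sums[wi][0] / sums[wi][1] for wi in w]


def care_estimate(a, y, y_hat):
    """Difference in mean residuals between exposed and unexposed units."""
    residuals = [yi - hi for yi, hi in zip(y, y_hat)]
    exposed = [r for ai, r in zip(a, residuals) if ai == 1]
    unexposed = [r for ai, r in zip(a, residuals) if ai == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


# Made-up trial data with a chance imbalance: three of the four exposed
# units fall in the low-outcome stratum (w = 0).
w = [0, 0, 0, 0, 1, 1, 1, 1]
a = [1, 1, 1, 0, 1, 0, 0, 0]
y = [8.0, 10.0, 12.0, 14.0, 18.0, 24.0, 26.0, 28.0]

y_hat = stratum_mean_predictions(y, w)
care = care_estimate(a, y, y_hat)
unadjusted = care_estimate(a, y, [0.0] * len(y))  # zero predictions
print(care, unadjusted)
```

With these made-up numbers the unadjusted difference is much larger in magnitude than the CARE estimate, since part of the raw difference between arms reflects the covariate imbalance rather than the exposure.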

In the bednet cluster randomized example, we used Poisson regression to estimate the child mortality rate for each cluster with independent variables for average age and the proportion of children that are female, but not for bednet assignment. The point estimate from CARE is then the difference between the average residual for intervention clusters and the average residual for control clusters (Figure 2c).

Theorem 2.1 In a randomized trial, the covariate-adjusted residuals estimator (CARE) is an unbiased estimator of the target statistical parameter $\psi_{rct}$ and thus the average treatment effect (ATE) because the identifiability assumptions hold by design.

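The proof is given in Appendix A. Briefly, consider the following estimating function of the observed data $O$ and parameter $\psi$: [Rose2011, Kennedy2017]

\[ D_{care}(O; \psi) = \left[ \frac{I(A = 1)}{P(A = 1)} - \frac{I(A = 0)}{P(A = 0)} \right] \left( Y - E(Y \mid W^Y) \right) - \psi \tag{8} \]

$D_{care}$ is an unbiased estimating function for $\psi_{rct}$ in that when $\psi = \psi_{rct}$ its expectation is zero: $E\left[ D_{care}(O; \psi_{rct}) \right] = 0$ (proof in Appendix A). The corresponding estimating equation is given by

\[ \frac{1}{n} \sum_{i=1}^{n} \hat{D}_{care}(O_i; \psi) = 0 \]

and we obtain a point estimate from CARE by solving this estimating equation. In other words, $\hat{\psi}_{care}$ is the solution satisfying $\frac{1}{n} \sum_{i=1}^{n} \hat{D}_{care}(O_i; \hat{\psi}_{care}) = 0$, as shown in Equation 7.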

CARE requires estimation of both the marginal probability of the exposure, $P(A = a)$, and the conditional expectation of the outcome given the covariates, $E(Y \mid W^Y)$. Since the exposure mechanism is always consistently estimated in randomized trials, CARE will be consistent for $\psi_{rct}$ in a trial setting. Under regularity conditions,[Rose2011, Kennedy2017] the Central Limit Theorem applies, and CARE is asymptotically normal with variance well-approximated by the sample variance of the estimated estimating function $\hat{D}_{care}$ divided by the sample size $n$. This simple variance estimator can be used for construction of Wald-type 95% confidence intervals and testing the null hypothesis.
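The variance estimator described above can be sketched as follows: evaluate the estimated estimating function for every unit, take its sample variance, divide by the sample size, and form a Wald-type interval. The function name `care_with_wald_ci`, the toy data, and the hard-coded predictions are hypothetical and only illustrate the arithmetic.

```python
import math
import statistics


def care_with_wald_ci(a, y, y_hat, z=1.96):
    """CARE point estimate (Equation 7) with a Wald-type confidence interval.

    The standard error is the square root of the sample variance of the
    per-unit estimating-function values divided by the sample size.
    """
    n = len(y)
    p1 = sum(a) / n          # empirical probability of exposure
    p0 = 1.0 - p1
    # Weighted residual contributed by each unit.
    contributions = [
        (1.0 / p1 if ai == 1 else -1.0 / p0) * (yi - hi)
        for ai, yi, hi in zip(a, y, y_hat)
    ]
    psi = sum(contributions) / n
    # Estimated estimating function: contribution minus the point estimate.
    d_values = [c - psi for c in contributions]
    se = math.sqrt(statistics.variance(d_values) / n)
    return psi, se, (psi - z * se, psi + z * se)


# Made-up data; y_hat stands in for predictions from a regression on
# baseline covariates that excludes the exposure.
a = [1, 1, 1, 0, 1, 0, 0, 0]
y = [8.0, 10.0, 12.0, 14.0, 18.0, 24.0, 26.0, 28.0]
y_hat = [11.0, 11.0, 11.0, 11.0, 24.0, 24.0, 24.0, 24.0]

psi, se, ci = care_with_wald_ci(a, y, y_hat)
print(psi, se, ci)
```

A two-sided test of the null hypothesis of no effect at the 5% level then rejects when the interval excludes zero.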

When accounting for predictive covariates that are imbalanced by chance between the two randomized groups, we expect CARE to provide efficiency gains over the unadjusted estimator.[Fisher1932, Cochran1957, Cox1982, Tsiatis2008, Moore2009, Rosenblum2010, Balzer2016] When the predicted values of the outcome are a constant value (e.g. zero or the mean of all observations, $\bar{Y}$), then CARE is equivalent to the unadjusted estimator (proof in Appendix A). Thus, the unadjusted estimator could be considered a special case of CARE.
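The equivalence for constant predictions can be seen in one line. The following is a short sketch of the argument, not a reproduction of the proof in Appendix A; it writes the constant prediction as $c$ (a symbol introduced here for illustration) and uses only the fact that $\frac{1}{n_a} \sum_{i} I(A_i = a) = 1$.

```latex
\begin{align*}
\hat{\psi}_{care}
  &= \frac{1}{n_1} \sum_{i=1}^{n} I(A_i = 1)\,(Y_i - c)
   - \frac{1}{n_0} \sum_{i=1}^{n} I(A_i = 0)\,(Y_i - c) \\
  &= \left[ \frac{1}{n_1} \sum_{i=1}^{n} I(A_i = 1)\, Y_i - c \right]
   - \left[ \frac{1}{n_0} \sum_{i=1}^{n} I(A_i = 0)\, Y_i - c \right] \\
  &= \frac{1}{n_1} \sum_{i=1}^{n} I(A_i = 1)\, Y_i
   - \frac{1}{n_0} \sum_{i=1}^{n} I(A_i = 0)\, Y_i .
\end{align*}
```

The constant cancels between the two arms, leaving the difference in mean outcomes, which is the unadjusted estimator.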

Theorem 2.2 In most observational settings of interest, the covariate-adjusted residuals estimator (CARE) is a biased estimator of the target statistical parameter $\psi_{obs}$.

The proof is given in Appendix B. Briefly, the expectation of the CARE estimating function (Equation 8) for $\psi_{obs}$ is

\[ E\left[ D_{care}(O; \psi_{obs}) \right] = \left[ E(Y \mid A = 1) - E(Y \mid A = 0) \right] - \psi_{obs} - E\left\{ \left[ \frac{I(A = 1)}{P(A = 1)} - \frac{I(A = 0)}{P(A = 0)} \right] E(Y \mid W) \right\} \]

where $W$ denotes the covariates used to predict the outcome. For this expectation to be zero, we would need the residual difference between the "unadjusted" estimand and the "adjusted" estimand to be captured by the average difference in the weighted predictions in the absence of the exposure. There is no reason to believe this would generally be the case. Instead, we can only expect this to hold under the strong null, where the exposure has no effect on the outcome within any stratum of the covariates, so that $\psi_{obs} = 0$ and $E(Y \mid A, W) = E(Y \mid W)$. When the null is false (i.e. there is an exposure effect), there might also be scenarios where we get some bias cancellation, but this cannot be proven under a non-parametric statistical model. Thus, CARE is only guaranteed to be a consistent estimator of $\psi_{obs}$ when the null is true and the conditional mean outcome is consistently estimated. Since we do not know a priori whether or not the null hypothesis is true (which is presumably why we are trying to estimate the exposure effect), we do not recommend that CARE be used in observational settings.
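A small numerical experiment can make this bias concrete. The sketch below uses a hypothetical data-generating process that is not the one in the paper's simulation study: a single binary confounder $W$ with $P(W = 1) = 0.5$, exposure probability $0.2 + 0.6W$, and outcome $Y = 1 + 2W - A + \varepsilon$ with standard normal noise, so the true effect is $-1$. Working through the expectation above for this process gives a large-sample CARE value of $-0.64$, whereas weighting by the within-stratum exposure probability recovers $-1$.

```python
import random


def simulate_observational(n, seed):
    """Draw (w, a, y) from a hypothetical confounded data-generating process."""
    rng = random.Random(seed)
    w, a, y = [], [], []
    for _ in range(n):
        wi = 1 if rng.random() < 0.5 else 0
        ai = 1 if rng.random() < 0.2 + 0.6 * wi else 0   # exposure depends on w
        yi = 1.0 + 2.0 * wi - 1.0 * ai + rng.gauss(0.0, 1.0)  # true effect: -1
        w.append(wi)
        a.append(ai)
        y.append(yi)
    return w, a, y


def weighted_residual_estimate(a, y, y_hat, p1):
    """Mean of inverse-probability-weighted residuals (form of Equation 7)."""
    total = 0.0
    for ai, yi, hi, gi in zip(a, y, y_hat, p1):
        weight = 1.0 / gi if ai == 1 else -1.0 / (1.0 - gi)
        total += weight * (yi - hi)
    return total / len(y)


w, a, y = simulate_observational(n=20000, seed=1)
n = len(y)

# Outcome predictions that ignore the exposure: mean outcome per stratum of w.
mean_y = {k: sum(yi for wi, yi in zip(w, y) if wi == k) / w.count(k) for k in (0, 1)}
y_hat = [mean_y[wi] for wi in w]

# CARE uses the marginal exposure probability for every unit.
care = weighted_residual_estimate(a, y, y_hat, [sum(a) / n] * n)

# Replacing it with the proportion exposed within each stratum of w
# removes the confounding in this example.
stratum_p = {k: sum(ai for wi, ai in zip(w, a) if wi == k) / w.count(k) for k in (0, 1)}
weighted = weighted_residual_estimate(a, y, y_hat, [stratum_p[wi] for wi in w])

print(care, weighted)
```

The gap between the two estimates does not shrink as the sample grows, which is the sense in which CARE is biased here even though the outcome predictions adjust for the confounder.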

3 Improving upon CARE with inverse probability weighting

As previously discussed, the inverse probability weighting (IPW) estimator controls for measured confounders by up-weighting observations that have a rare exposure-covariate combination (relative to a randomized trial) and down-weighting those with a common exposure-covariate combination (again relative to a randomized trial) using the estimated propensity scores $\hat{P}(A = a \mid W^C)$. This suggests that we may improve CARE (Equation 7) for use in observational settings by replacing the empirical probabilities of exposure $\hat{P}(A = a)$ with estimated propensity scores $\hat{P}(A = a \mid W^C)$. To our knowledge, such an estimator has not been previously proposed or explored, and thus we name it "covariate-adjusted residuals estimator with inverse probability weighting" (CARE–IPW):

\[ \hat{\psi}_{care\text{-}ipw} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{I(A_i = 1)}{\hat{P}(A = 1 \mid W_i^C)} - \frac{I(A_i = 0)}{\hat{P}(A = 0 \mid W_i^C)} \right] \left( Y_i - \hat{Y}_i \right) \tag{9} \]

The CARE–IPW estimate is the difference in the weighted average of the residuals for the intervention group and for the control group.
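As a minimal sketch of Equation 9, the function below takes the outcome predictions and the estimated propensity scores as inputs, so that any fitting procedure can supply them. The function name and the numbers are hypothetical; the propensity scores and predictions in the example are simply the within-stratum exposure proportions and mean outcomes of the made-up data.

```python
def care_ipw_estimate(a, y, y_hat, p1):
    """CARE-IPW: average of inverse-probability-weighted residuals.

    a: 0/1 exposures; y: outcomes; y_hat: outcome predictions made without
    the exposure; p1: estimated propensity scores P(A = 1 | confounders).
    """
    n = len(y)
    total = 0.0
    for ai, yi, hi, gi in zip(a, y, y_hat, p1):
        if not 0.0 < gi < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Exposed units get weight 1/gi; unexposed units get -1/(1 - gi).
        weight = 1.0 / gi if ai == 1 else -1.0 / (1.0 - gi)
        total += weight * (yi - hi)
    return total / n


# Made-up observational data with two confounder strata of four units each.
a = [1, 0, 0, 0, 1, 1, 1, 0]
y = [10.0, 12.0, 14.0, 16.0, 20.0, 22.0, 24.0, 30.0]
p1 = [0.25] * 4 + [0.75] * 4      # proportion exposed in each stratum
y_hat = [13.0] * 4 + [24.0] * 4   # mean outcome in each stratum

estimate = care_ipw_estimate(a, y, y_hat, p1)
print(estimate)
```

In this toy example the predictions depend only on the stratum used for the propensity score, so the prediction term cancels and the result coincides with plain IPW; predictions built from additional outcome covariates would generally give a different, and potentially more precise, estimate.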

The CARE–IPW estimator can also be applied to randomized trials by replacing the confounding covariates $W^C$, which are not present in randomized trials, with the covariates that affect the outcome, $W^Y$. In the bednet cluster randomized trial example, we used Poisson regression to estimate the child mortality rate for each cluster, $\hat{Y}_i$, as in the CARE estimator, and a logistic regression to estimate the propensity scores for bednet assignment, $\hat{P}(A = 1 \mid W_i^Y)$, as in the IPW estimator (Figure 2d).

Theorem 3.1 In an observational setting, the covariate-adjusted residuals estimator with inverse probability weighting (CARE–IPW) is an unbiased estimator of the target statistical parameter $\psi_{obs}$. In randomized trials where the identifiability assumptions hold by design and $\psi_{obs} = \psi_{rct}$, CARE–IPW is an unbiased estimator of the average treatment effect (ATE).

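The proof is given in Appendix C. Briefly, consider the following estimating function of the observed data $O$ and parameter $\psi$:

\[ D_{care\text{-}ipw}(O; \psi) = \left[ \frac{I(A = 1)}{P(A = 1 \mid W^C)} - \frac{I(A = 0)}{P(A = 0 \mid W^C)} \right] \left( Y - E(Y \mid W) \right) - \psi \tag{10} \]

where $W$ denotes the covariates used to predict the outcome. $D_{care\text{-}ipw}$ is an unbiased estimating function for $\psi_{obs}$ in that when $\psi = \psi_{obs}$ its expectation is zero: $E\left[ D_{care\text{-}ipw}(O; \psi_{obs}) \right] = 0$ (proof in Appendix C). The corresponding estimating equation is given by

\[ \frac{1}{n} \sum_{i=1}^{n} \hat{D}_{care\text{-}ipw}(O_i; \psi) = 0 \]

We obtain a point estimate from CARE–IPW by solving this estimating equation. In other words, $\hat{\psi}_{care\text{-}ipw}$ is the solution satisfying $\frac{1}{n} \sum_{i=1}^{n} \hat{D}_{care\text{-}ipw}(O_i; \hat{\psi}_{care\text{-}ipw}) = 0$, as shown in Equation 9.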

CARE–IPW requires estimation of both the propensity score $P(A = a \mid W^C)$ and the conditional expectation of the outcome given the covariates, $E(Y \mid W)$. CARE–IPW is consistent when the propensity score is consistently estimated, or when the null is true and the conditional mean outcome is consistently estimated. Under regularity conditions,[Rose2011, Kennedy2017] the Central Limit Theorem applies, and CARE–IPW is asymptotically normal with variance well-approximated by the sample variance of the estimated estimating function $\hat{D}_{care\text{-}ipw}$ divided by the sample size $n$. By predicting the outcome with the covariates, we expect CARE–IPW to provide efficiency gains over the IPW estimator.

Each of the estimators described in this paper can be characterized as special cases of CARE–IPW. CARE–IPW reduces to CARE when the propensity scores are estimated with the empirical probability of exposure. CARE–IPW reduces to IPW when the predicted values of the outcome are all zero. CARE–IPW reduces to the unadjusted estimator when both of the above conditions are met.
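These reductions are easy to check numerically. The sketch below is illustrative only: the function names, data, and propensity values are made up, and the weighted-residual function follows the form of Equation 9.

```python
def care_ipw_estimate(a, y, y_hat, p1):
    """Average of inverse-probability-weighted residuals (form of Equation 9)."""
    total = 0.0
    for ai, yi, hi, gi in zip(a, y, y_hat, p1):
        weight = 1.0 / gi if ai == 1 else -1.0 / (1.0 - gi)
        total += weight * (yi - hi)
    return total / len(y)


def mean_difference(a, values):
    """Difference in the mean of `values` between exposed and unexposed units."""
    exposed = [v for ai, v in zip(a, values) if ai == 1]
    unexposed = [v for ai, v in zip(a, values) if ai == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


# Made-up data, predictions, and propensity scores.
a = [1, 1, 1, 0, 1, 0, 0, 0]
y = [8.0, 10.0, 12.0, 14.0, 18.0, 24.0, 26.0, 28.0]
y_hat = [11.0] * 4 + [24.0] * 4
propensity = [0.6] * 4 + [0.4] * 4

n = len(y)
marginal = [sum(a) / n] * n   # empirical probability of exposure for every unit
zeros = [0.0] * n             # predictions that are all zero

# Marginal exposure probability as "propensity score": CARE (Equation 6).
care = mean_difference(a, [yi - hi for yi, hi in zip(y, y_hat)])
reduces_to_care = abs(care_ipw_estimate(a, y, y_hat, marginal) - care) < 1e-9

# Zero predictions: the residuals are the outcomes, giving IPW (Equation 5).
ipw = sum((1.0 / g if ai == 1 else -1.0 / (1.0 - g)) * yi
          for ai, yi, g in zip(a, y, propensity)) / n
reduces_to_ipw = abs(care_ipw_estimate(a, y, zeros, propensity) - ipw) < 1e-9

# Both conditions at once: the unadjusted estimator (Equation 4).
unadjusted = mean_difference(a, y)
reduces_to_unadjusted = abs(care_ipw_estimate(a, y, zeros, marginal) - unadjusted) < 1e-9

print(reduces_to_care, reduces_to_ipw, reduces_to_unadjusted)
```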

4 Simulation Study

In this section, we use a simulation to compare the performance of the unadjusted estimator, IPW, CARE, and CARE–IPW in a randomized trial as well as an observational setting. We consider a synthetic data-generating process with a binary exposure and a binary outcome. In the randomized setting, three covariates affect the outcome. In the observational setting, the relationship between the exposure and the outcome is confounded by two covariates. In both settings, there is no unmeasured confounding and positivity holds by design; estimates can, therefore, be interpreted causally.

We compare the performance of the estimators using bias, Monte Carlo standard error, average standard error estimate, confidence interval coverage, power, and type I error. Let $\hat{\psi}_j$ denote the point estimate in simulation $j$, $j = 1, \ldots, J$. Bias is the average difference between the point estimate and the statistical parameter, $\frac{1}{J} \sum_{j=1}^{J} (\hat{\psi}_j - \psi)$, where $\psi$ is $\psi_{obs}$ in the observational setting and $\psi_{rct}$ in the randomized setting. Monte Carlo standard error is the standard deviation of the point estimates across simulations. The average standard error estimate is the mean of the estimated standard errors across simulations, $\frac{1}{J} \sum_{j=1}^{J} \sqrt{\hat{\sigma}_j^2}$, where $\hat{\sigma}_j^2$ is the influence curve-based estimate of the variance in simulation $j$. Confidence interval coverage is the proportion of 95% confidence intervals that covered the statistical parameter across all simulations. Power is the proportion of simulations in which the estimator rejected the null hypothesis of no exposure effect when there was an exposure effect. Type I error is the proportion of simulations in which the estimator rejected the null hypothesis of no exposure effect when the null hypothesis was true.
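The performance measures defined above can be computed from the per-simulation point estimates and standard error estimates. The following sketch is illustrative: the function name `summarize_simulations` and the four made-up replicates are hypothetical, and the rejection rule assumes a two-sided Wald test at the 5% level.

```python
import statistics


def summarize_simulations(estimates, std_errors, truth, z=1.96):
    """Summarize estimator performance across simulation replicates.

    The rejection rate is power when there is a true effect and type I
    error when the null hypothesis is true.
    """
    n_sims = len(estimates)
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    rejected = sum(1 for e, s in zip(estimates, std_errors) if abs(e) > z * s)
    return {
        "bias": sum(e - truth for e in estimates) / n_sims,
        "mc_se": statistics.stdev(estimates),       # spread of point estimates
        "average_se": sum(std_errors) / n_sims,     # mean estimated standard error
        "coverage": covered / n_sims,
        "rejection_rate": rejected / n_sims,
    }


# Four made-up replicates, purely to exercise the function.
estimates = [-0.30, -0.25, -0.35, -0.05]
std_errors = [0.10, 0.10, 0.10, 0.10]
summary = summarize_simulations(estimates, std_errors, truth=-0.28)
print(summary)
```

Comparing `mc_se` with `average_se` indicates whether the variance estimator tracks the true sampling variability; an average standard error well above the Monte Carlo standard error goes along with conservative coverage.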

All simulations were run using R version 3.4.3.[R2018] Simulations were run in parallel on 15 cores on a remote server. To maintain reproducibility and to make sure that the same samples were drawn for each scenario (with and without an effect in randomized and observational settings) across simulations, we set a seed for the random number generator for each simulation based on the simulation number. The code used for this project can be found at https://doi.org/10.5281/zenodo.3517241.
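The seeding scheme described above can be illustrated with a short sketch. The authors' code is in R; the Python below is only an analogue of the idea, with a hypothetical function name and a placeholder draw in place of the actual data-generating process.

```python
import random


def draw_sample(sim_number, n_units=96):
    """Draw one simulated sample using a generator seeded by the simulation number.

    Seeding by simulation number means every scenario that reuses the same
    number starts from identical random draws, regardless of which worker
    or core runs it.
    """
    rng = random.Random(sim_number)
    return [rng.random() for _ in range(n_units)]


# The same simulation number reproduces the same draws; a different one does not.
first = draw_sample(7)
repeat = draw_sample(7)
other = draw_sample(8)
print(first == repeat, first == other)
```

Using a separate generator object per simulation, rather than one shared global stream, keeps results independent of the order in which parallel workers finish.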

4.1 Setup

To study the finite sample properties of CARE and CARE–IPW relative to those of the unadjusted and IPW estimators, we designed a synthetic simulation with binary exposures and outcomes.

Consider an experiment with 96 units. For each unit in the sample, we generate four independent baseline covariates: , , , and . We simulate a randomized trial where the exposure is assigned with probability 0.5 as well as an observational setting where the exposure is assigned with a probability given by . Each unit’s counterfactual outcomes, and , are generated as

by deterministically setting the exposure to $A = 1$ and $A = 0$, respectively. The average treatment effect is calculated by taking the mean difference in the counterfactual outcomes for a population of 100,000 units. We also simulate a scenario under the null hypothesis of no exposure effect by setting the counterfactual outcome with the exposure, $Y(1)$, equal to the counterfactual outcome without the exposure, $Y(0)$.

We implement the unadjusted estimator as the difference in average outcomes between exposed and unexposed units (Equation 4). When estimating the propensity score, required for IPW and CARE–IPW, we use a logistic regression with main terms for the two covariates that are the confounders in the observational setting. For the outcome prediction, which is required for CARE and CARE–IPW, we use a logistic regression with main terms for the three covariates that affect the outcome, which corresponds to the correctly specified regression under the null.

4.2 Results

Table 1 provides a comparison of the performance of the estimators over $J = 5{,}000$ repetitions of the simulation. When there is an effect, the intervention led to an average reduction of 28.1% in the outcome.

All estimators are unbiased in the randomized trial setting. The 95% confidence interval coverage for each algorithm is close to or above the nominal level. Improvements in Monte Carlo standard error, average standard error, and statistical power over both the unadjusted and the IPW estimators are achieved by both CARE and CARE–IPW.

Trial  Exposure  Estimator  Bias     MC SE   Average SE  95% CI coverage  Power / Type I error
RCT    Effect    CARE–IPW    0.003   0.092   0.092       94.5%            85.4%
RCT    Effect    CARE        0.008   0.090   0.090       94.4%            85.3%
RCT    Effect    IPW        -0.001   0.094   0.148       99.7%            46.2%
RCT    Effect    Unadj      -0.002   0.101   0.101       94.3%            78.1%
RCT    Null      CARE–IPW    0.000   0.093   0.090       94.1%            5.9%
RCT    Null      CARE        0.000   0.091   0.089       94.4%            5.6%
RCT    Null      IPW         0.000   0.095   0.167       99.9%            0.1%
RCT    Null      Unadj      -0.000   0.104   0.103       94.4%            5.6%
Obs    Effect    CARE–IPW    0.000   0.115   0.115       94.5%            71.0%
Obs    Effect    CARE        0.062   0.082   0.081       87.4%            75.6%
Obs    Effect    IPW        -0.005   0.126   0.164       98.7%            44.6%
Obs    Effect    Unadj      -0.197   0.088   0.089       41.7%            100%
Obs    Null      CARE–IPW   -0.004   0.107   0.102       94.1%            5.9%
Obs    Null      CARE       -0.003   0.079   0.087       96.7%            3.3%
Obs    Null      IPW        -0.005   0.124   0.197       99.6%            0.4%
Obs    Null      Unadj      -0.219   0.100   0.099       39.7%            60.3%
Table 1: Results for the estimators for the simulation by trial type (RCT: randomized trial; Obs: observational) and exposure effect. The last column reports power in the rows with an exposure effect and type I error in the rows under the null. The covariate-adjusted residuals estimator (CARE) uses a logistic regression with the three covariates that affect the outcome to predict the outcome. The inverse probability weighting (IPW) estimator uses a logistic regression with the two confounders to estimate the propensity scores. CARE with inverse probability weighting (CARE–IPW) uses the same regression as CARE to predict the outcome and the same regression as IPW to estimate the propensity scores.

When there is an effect in the observational setting, the unadjusted estimator is markedly biased with low confidence interval coverage: 41.7%. By adjusting for confounders when predicting the outcome, CARE reduces but does not eliminate bias and achieves confidence interval coverage of 87.4%, still well below the nominal level. Through consistent estimation of the propensity score, and thereby control for the confounders, both the IPW estimator and CARE–IPW are unbiased and achieve nominal to conservative confidence interval coverage. CARE–IPW is more efficient and achieves higher statistical power than the IPW estimator: 71.0% vs. 44.6%, respectively.

When there is no effect in an observational setting, the unadjusted estimator is again biased with low confidence interval coverage at 39.7%. Both CARE and CARE–IPW are unbiased with nominal to conservative Type I error control and greater precision than the IPW estimator. We note that under the null, CARE is expected to be consistent if the outcome is correctly predicted, which it was here.

Altogether this simulation confirms the theoretical properties described in Sections 2-3.

5 Case study

In this section, we first reproduce the findings of [Hayes2009], who compared CARE to the unadjusted estimator in the cluster randomized trial to estimate the impact of impregnated bednets on child mortality in northern Ghana.[Bennett2002, Binka1996] Then we apply the IPW estimator and CARE–IPW on the same data and discuss the results.

5.1 Setup

In the original analysis, the researchers estimated the unadjusted and covariate-adjusted mortality rates for the exposed and unexposed groups and compared them using the t-test. For the unadjusted estimator, the observed mortality rate (i.e. the number of deaths per thousand follow-up years) was calculated for each cluster. The unadjusted estimate of the ATE was equal to the difference in the average observed mortality rate between randomized arms (Figure 2a).

In the covariate-adjusted analysis, the researchers used a Poisson regression for mortality rate on the individual-level data using age and sex as covariates, but not the cluster intervention assignment. From this regression, they predicted the mortality rate per follow-up year for each child, which they then aggregated into cluster-level predicted mortality rates per thousand follow-up years. The researchers found the residuals by taking the difference between the observed and predicted mortality rates for each cluster. The CARE estimate of the ATE was equal to the difference in the average of the residuals between randomized arms (Figure 2c). Hayes and Moulton used a t-test to generate confidence intervals and conduct hypothesis testing.

We reproduce this analysis and extend it to include IPW and CARE–IPW (Figure 2b,d). While our point estimates of the ATE for CARE and the unadjusted estimator are identical to those in Hayes and Moulton, we estimate the variance using influence curve-based methods (Sections 2-3), which yield slightly different confidence intervals and p-values. The IPW and CARE–IPW estimators require propensity scores, which we estimate with a main terms logistic regression for the exposure at the cluster-level using average age in months and percent of children who are female as covariates. For CARE–IPW, we use the same predicted values of the outcome from the individual-level regression as used for CARE.

5.2 Results

The IPW and CARE–IPW estimates of the exposure effect are larger in magnitude than the estimates from the unadjusted estimator or CARE (Figure 3). As in the original analysis, we estimate a mortality rate difference between the exposed group and the unexposed group of -3.95 (95% CI: -8.46, 0.56; p-value = 0.09) per thousand follow-up years using the unadjusted estimator and -4.26 (95% CI: -8.67, 0.15; p-value = 0.06) per thousand follow-up years using CARE. Using IPW, the estimated mortality rate difference is -5.37 (95% CI: -16.94, 6.2; p-value = 0.36) per thousand follow-up years. For CARE–IPW the mortality rate difference is -5.08 (95% CI: -9.46, -0.7; p-value = 0.02) per thousand follow-up years. As in the simulation study, the standard error estimates for CARE and CARE–IPW are smaller than those of the unadjusted and IPW estimators. While IPW had the largest estimated effect size, it also had the largest estimated variance and thereby the widest confidence interval of any estimator. The CARE–IPW estimate was larger in magnitude than the estimates from either the unadjusted estimator or CARE and had the smallest variance estimate.

Figure 3: The estimates and 95% confidence intervals for the effect of allocating bednets on childhood mortality rate per thousand follow-up years; data obtained from [Hayes2009]. The four algorithms are the unadjusted estimator, the covariate-adjusted residuals estimator (CARE), the inverse probability weighting estimator (IPW), and CARE with inverse probability weighting (CARE–IPW). While all estimates indicate that bednets cause a reduction in childhood mortality, the CARE and CARE–IPW estimates are more precise than those of the unadjusted and IPW estimators.

6 Discussion

In this paper, we (1) provide non-parametric statistical theory for the covariate-adjusted residuals estimator (CARE) in randomized and observational settings, (2) propose a novel estimator, the covariate-adjusted residuals estimator with inverse probability weighting (CARE–IPW), and (3) support theoretical results with a simulation study and an application to a cluster randomized trial. Specifically, we prove that CARE is consistent for the average treatment effect (ATE) in randomized studies. We also prove that CARE is not consistent for the ATE in most observational settings of interest (e.g. when there is an exposure effect). We develop a new estimator, CARE–IPW, which is consistent for the ATE in observational settings when the propensity scores are consistently estimated.

The simulation study supports our theoretical findings and suggests some advantages to using CARE–IPW rather than CARE or the IPW estimator. In randomized trials, CARE and CARE–IPW achieved greater precision and statistical power, compared to the other estimators, and maintained nominal confidence interval coverage. In observational settings, CARE–IPW is consistent for the statistical parameter when accounting for the confounding covariates in the propensity score model and has greater statistical power and less variability than the IPW estimator. CARE is biased in observational settings with an exposure effect.
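The contrast between the two settings can also be seen without simulation noise by computing large-sample limits exactly on a small discrete distribution. The sketch below is not the paper's simulation design: the data-generating values are illustrative assumptions, and the weighting forms (marginal exposure probabilities for CARE, propensity scores for CARE–IPW) are assumed.

```python
# Hypothetical discrete data-generating process; every number is an assumption.
P_W = {0: 0.5, 1: 0.5}   # distribution of a binary covariate W
TRUE_ATE = 1.0            # exposure effect, constant across W

def mean_y(a, w):
    """E[Y | A=a, W=w] under the assumed outcome model."""
    return TRUE_ATE * a + 2.0 * w

def limits(prop):
    """Large-sample limits of CARE and CARE-IPW when P(A=1 | W=w) = prop(w)."""
    def g(a, w):
        return prop(w) if a == 1 else 1 - prop(w)

    def q_bar(w):
        # Outcome prediction from W alone: E[Y | W=w], averaging over exposure.
        return sum(g(a, w) * mean_y(a, w) for a in (0, 1))

    p_a = {a: sum(P_W[w] * g(a, w) for w in P_W) for a in (0, 1)}  # P(A=a)
    care = care_ipw = 0.0
    for a in (0, 1):
        sign = 1 if a == 1 else -1
        for w in P_W:
            joint = P_W[w] * g(a, w)            # P(A=a, W=w)
            residual = mean_y(a, w) - q_bar(w)  # expected residual in this cell
            care += joint * sign / p_a[a] * residual       # weights ignore W
            care_ipw += joint * sign / g(a, w) * residual  # weights use propensity
    return care, care_ipw

trial_care, trial_care_ipw = limits(lambda w: 0.5)        # randomized exposure
obs_care, obs_care_ipw = limits(lambda w: 0.2 + 0.6 * w)  # confounded by W
```

With these assumed values, both limits equal the true effect of 1 when exposure is randomized. Under confounding, the CARE limit is 0.64 while the CARE–IPW limit remains 1.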

While CARE–IPW improves on CARE and IPW, it is not a “double robust” estimator, unlike targeted maximum likelihood estimation and augmented inverse probability weighting.[robins2000robust, vanderLaan2003, VanderLaan2006, MarkBook] A double robust estimator is consistent for the target parameter if either the outcome predictions (which often include the exposure as well as baseline covariates) or the propensity scores are consistently estimated, and it is the most efficient estimator if both are. Similar to the IPW estimator, CARE–IPW is consistent for the target parameter if and only if the propensity score is consistently estimated. By incorporating predictions of the outcome, CARE–IPW is expected to be more efficient than IPW.

One advantage to using CARE–IPW rather than another method is that researchers do not need to specify the relationship between the exposure and the outcome. This can be beneficial when there is a complex relationship between the exposure and outcome, such as multiple non-linear interactions with other covariates that augment the strength of the exposure.

As with the IPW estimator, CARE–IPW may have stability issues when estimated propensity scores approach zero or one.[Petersen2010] This could be resolved in one of two ways. Stabilized weights could be used to scale propensity scores away from zero and one.[Robins2000] Alternatively, propensity scores could be replaced by incremental propensity scores, which relax the positivity assumption by looking at the effect of an intervention in which propensity scores are uniformly increased or decreased across all observations.[Kennedy2018]
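To make the instability concrete, the sketch below computes inverse probability weights with and without stabilization. It assumes one common form of stabilized weights, in which each weight is multiplied by the marginal probability of the exposure actually received; this may not be the exact form intended above, and the function name and toy values are hypothetical.

```python
def ipw_weights(arm, propensity, stabilized=False):
    """Weights 1 / P(A = a_i | W_i), optionally multiplied by the marginal P(A = a_i)."""
    p_treated = sum(arm) / len(arm)
    weights = []
    for a, g1 in zip(arm, propensity):
        g_obs = g1 if a == 1 else 1 - g1  # probability of the exposure received
        marginal = p_treated if a == 1 else 1 - p_treated
        weight = 1 / g_obs
        weights.append(marginal * weight if stabilized else weight)
    return weights

# Hypothetical values: the first unit is exposed despite a propensity of 0.02.
arm = [1, 1, 0, 0]
propensity = [0.02, 0.5, 0.5, 0.9]

raw = ipw_weights(arm, propensity)
stable = ipw_weights(arm, propensity, stabilized=True)
```

In this toy example the unit with propensity 0.02 has a raw weight roughly 25 times that of a unit with propensity 0.5. Stabilization shrinks the weights but leaves that ratio unchanged, so extreme propensity scores still warrant attention.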

The findings of this paper suggest that CARE is suitable for use in estimating the average treatment effect in randomized trials, but not in observational settings. As an alternative to CARE, CARE–IPW has potential for use as an estimator in observational settings; further research is warranted.

Appendix A Proof of Theorem 2.1

Suppose we are in a trial setting. Let denote the covariates that are predictive of the outcome, be a binary indicator of receiving the exposure, and be the outcome. We assume that we have independent, identically distributed copies of observed data with some distribution . In the following, we assume discrete random variables for simplicity; however, all summations generalize to integrals for continuous random variables. In a randomized trial, our target statistical estimand is .

The expectation of the CARE estimating function (Equation 8) is

The first component of the expectation is equal to the target parameter in a randomized trial :

The second component of the expectation is 0:

In a randomized trial, we have , and thus the second component is zero.

Thus, when our parameter of interest is , the expectation of the CARE estimating function is zero.

This proves that in a trial setting the CARE estimating function is unbiased for the statistical parameter , which identifies the average treatment effect because identifiability assumptions hold by design.
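The argument can be summarized in the following sketch. The notation is assumed rather than taken from the main text: $O = (W, A, Y)$ for covariates, exposure, and outcome; $\bar{Q}(W)$ for the outcome prediction from covariates alone; and $\psi$ for the target parameter. The stated form of the estimating function is likewise an assumption and may differ in detail from Equation 8.

```latex
% Assumed notation: O = (W, A, Y), \bar{Q}(W) = outcome prediction from W alone,
% \psi = target parameter. The form of D_CARE is an assumption.
\begin{align*}
D_{\mathrm{CARE}}(O) &= \left( \frac{I(A=1)}{P(A=1)} - \frac{I(A=0)}{P(A=0)} \right) \bigl( Y - \bar{Q}(W) \bigr) - \psi \\
E\bigl[ D_{\mathrm{CARE}}(O) \bigr] &= \underbrace{E[Y \mid A=1] - E[Y \mid A=0]}_{\text{first component}}
  - \underbrace{\bigl( E[\bar{Q}(W) \mid A=1] - E[\bar{Q}(W) \mid A=0] \bigr)}_{\text{second component}} - \psi \\
\text{first component} &= \sum_w \bigl\{ E[Y \mid A=1, W=w]\, P(w \mid A=1) - E[Y \mid A=0, W=w]\, P(w \mid A=0) \bigr\} \\
&= \sum_w \bigl\{ E[Y \mid A=1, W=w] - E[Y \mid A=0, W=w] \bigr\} P(w) = \psi \\
\text{second component} &= \sum_w \bar{Q}(w) \bigl\{ P(w \mid A=1) - P(w \mid A=0) \bigr\} = 0 \\
E\bigl[ D_{\mathrm{CARE}}(O) \bigr] &= \psi - 0 - \psi = 0
\end{align*}
```

Both simplifications rely on randomization, which gives $P(w \mid A=1) = P(w \mid A=0) = P(w)$.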

Corollary 2.1.1: If the predicted outcome is a constant (e.g. 0 or the sample average outcome), CARE reduces to the unadjusted difference in mean outcomes.

Proof: Denote the predicted outcome with a constant , and let and denote the number of treated and control units, respectively. Then we have
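In assumed notation, with $c$ for the constant prediction, $n_1$ and $n_0$ for the numbers of treated and control units, and $\hat{\psi}_{\mathrm{CARE}}$ for the CARE estimate, the step can be sketched as:

```latex
% Assumed notation: c = constant prediction, n_1 and n_0 = arm sizes,
% \bar{Y}_1 and \bar{Y}_0 = arm-specific sample means of the outcome.
\begin{align*}
\hat{\psi}_{\mathrm{CARE}}
 &= \frac{1}{n_1} \sum_{i : A_i = 1} (Y_i - c) - \frac{1}{n_0} \sum_{i : A_i = 0} (Y_i - c) \\
 &= \bigl( \bar{Y}_1 - c \bigr) - \bigl( \bar{Y}_0 - c \bigr) \\
 &= \bar{Y}_1 - \bar{Y}_0
\end{align*}
```

The constant cancels between arms, leaving the unadjusted difference in mean outcomes.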

Appendix B Proof of Theorem 2.2

Suppose we are in an observational setting. Let denote the confounding covariates, be a binary indicator of receiving the exposure, and be the outcome. We assume that we have independent, identically distributed copies of observed data with some distribution . In the following, we assume discrete random variables for simplicity; however, all summations generalize to integrals for continuous random variables. In an observational setting, our target statistical estimand is .

Using the same steps as in the proof of Theorem 2.1 (Appendix A), but replacing the predictive covariates with the confounding covariates , the expectation of the CARE estimating function (Equation 8) in an observational setting is given by

where and where, due to confounding, . When our parameter of interest is , the expectation is generally not zero.

Under a non-parametric statistical model, we can only guarantee the expectation is zero under the strong null, where and :

When the null is false, there might also be some scenarios when , but this cannot be proven under a non-parametric statistical model. Thus, we conclude that the CARE estimating function is generally not unbiased for , even if the covariates , which are sufficient to control for confounding, are used to predict the outcome.
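A sketch of this argument follows. The notation is assumed: $W$ for the confounding covariates, $\bar{Q}(W)$ for the outcome prediction from covariates alone, and $\psi = \sum_w \{ E[Y \mid A=1, W=w] - E[Y \mid A=0, W=w] \} P(w)$ for the target parameter. The form of the estimating function is also an assumption.

```latex
% Assumed notation; confounding means P(w | A=1) and P(w | A=0) generally differ.
\begin{align*}
E\bigl[ D_{\mathrm{CARE}}(O) \bigr]
 &= \sum_w E[Y \mid A=1, W=w]\, P(w \mid A=1) - \sum_w E[Y \mid A=0, W=w]\, P(w \mid A=0) \\
 &\quad - \sum_w \bar{Q}(w) \bigl\{ P(w \mid A=1) - P(w \mid A=0) \bigr\} - \psi \\
\intertext{Under the strong null, $E[Y \mid A=1, W=w] = E[Y \mid A=0, W=w] = \bar{Q}(w)$ and $\psi = 0$, so}
E\bigl[ D_{\mathrm{CARE}}(O) \bigr]
 &= \sum_w \bar{Q}(w) \bigl\{ P(w \mid A=1) - P(w \mid A=0) \bigr\}
  - \sum_w \bar{Q}(w) \bigl\{ P(w \mid A=1) - P(w \mid A=0) \bigr\} - 0 = 0
\end{align*}
```

Away from the strong null, the conditional means no longer coincide with $\bar{Q}(w)$, and nothing forces the remaining terms to cancel.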

Appendix C Proof of Theorem 3.1

Suppose we are in an observational setting. Let denote the confounding covariates, be a binary indicator of receiving the exposure, and be the outcome. We assume that we have independent, identically distributed copies of observed data with some distribution . In the following, we assume discrete random variables for simplicity; however, all summations generalize to integrals for continuous random variables. In an observational setting, our target statistical estimand is .

The expectation of the CARE–IPW estimating function (Equation 10) is

The first component of the expectation is equivalent to the IPW estimand and equal to the target parameter :

The second component of the expectation is 0:

When our parameter of interest is , the expectation of the CARE–IPW estimating function is zero.

This proves that in an observational setting the CARE–IPW estimating function is unbiased for the statistical parameter , which identifies the average treatment effect under the assumptions outlined in Section 1. In a randomized trial, we have , and thus the CARE–IPW estimating function is also unbiased when the exposure mechanism is known.
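The two components can be sketched as follows. The notation is assumed: $g(a \mid W) = P(A = a \mid W)$ for the propensity score, taken to be strictly positive; $\bar{Q}(W)$ for the outcome prediction from covariates alone; and $\psi = E_W \{ E[Y \mid A=1, W] - E[Y \mid A=0, W] \}$ for the target parameter. The form of the estimating function is an assumption and may differ in detail from Equation 10.

```latex
% Assumed notation: g(a | W) = P(A = a | W) > 0, \bar{Q}(W) = prediction from W alone.
\begin{align*}
D_{\mathrm{CARE\text{-}IPW}}(O) &= \left( \frac{I(A=1)}{g(1 \mid W)} - \frac{I(A=0)}{g(0 \mid W)} \right) \bigl( Y - \bar{Q}(W) \bigr) - \psi \\
\text{first component} &= E\!\left[ \left( \frac{I(A=1)}{g(1 \mid W)} - \frac{I(A=0)}{g(0 \mid W)} \right) Y \right] \\
 &= E_W\!\left[ \frac{E[Y \mid A=1, W]\, g(1 \mid W)}{g(1 \mid W)} - \frac{E[Y \mid A=0, W]\, g(0 \mid W)}{g(0 \mid W)} \right] = \psi \\
\text{second component} &= E_W\!\left[ \bar{Q}(W) \left( \frac{g(1 \mid W)}{g(1 \mid W)} - \frac{g(0 \mid W)}{g(0 \mid W)} \right) \right] = 0 \\
E\bigl[ D_{\mathrm{CARE\text{-}IPW}}(O) \bigr] &= \psi - 0 - \psi = 0
\end{align*}
```

The second component vanishes for any function of $W$ used as the prediction, which is why a misspecified outcome model does not bias the estimating function under these assumptions.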

Acknowledgements

We thank Mark van der Laan for his expert advice.

This project was funded by NIH NIAID grant 1R01AI102939 and NIGMS grant R35GM119582. The findings and conclusions in this manuscript are those of the authors and do not necessarily represent the views of the National Institutes of Health or the National Institute of General Medical Sciences. The funders had no role in study design, data collection and analysis, decision to present, or preparation of the presentation.

References