Treatment effect bias from sample snooping: blinding outcomes is neither necessary nor sufficient
Popular guidance on observational data analysis states that outcomes should be blinded when determining matching criteria or propensity scores. Such blinding is informally said to maintain the "objectivity" of the analysis (Rubin et al., 2008). To explore these issues, we begin by proposing a definition of objectivity based on the worst-case bias that can occur without blinding, which we call "added variable bias." This bias is indeed severe, and can diverge towards infinity as the sample size grows. However, we also show that bias of the same order of magnitude can occur even if the outcomes are blinded, so long as some prior knowledge is available that links covariates to outcomes. Finally, we outline an alternative sample partitioning procedure for estimating the average treatment effect on the controls, or the average treatment effect on the treated, while avoiding added variable bias. This procedure does not require the analysis to be fully prespecified; uses the outcome data from all partitions in the final analysis step; and does not require blinding. Together, these results illustrate that outcome blinding is neither necessary nor sufficient for preventing added variable bias, and should not be considered a requirement when evaluating novel causal inference methods.