Calibrated inference: statistical inference that accounts for both sampling uncertainty and distributional uncertainty
During data analysis, analysts often have to make seemingly arbitrary decisions. For example, during data pre-processing there are various options for dealing with outliers or imputing missing data. Similarly, many specifications and methods can be reasonable for addressing a domain question. This may be seen as a hindrance to reliable inference, since conclusions can change depending on the analyst's choices. In this paper, we argue that this situation is an opportunity to construct confidence intervals that account not only for sampling uncertainty but also for a form of distributional uncertainty. The distributional uncertainty model is related to a variety of potential issues with the data, ranging from dependence between observations to selection bias and confounding. The final procedure can be seen as an extension of current statistical practice. However, the underlying assumptions are quite different. Standard statistical practice often relies on the i.i.d. assumption. We rely on a strictly weaker symmetry assumption, stating that the empirical distribution and the target distribution differ by an isotropic distributional perturbation.
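To illustrate the general idea of a confidence interval that accounts for uncertainty beyond sampling variability, the following sketch widens the usual interval for a mean by an extra variance term representing distributional uncertainty. This is a minimal illustration, not the paper's procedure: the scale `delta` of the distributional perturbation is assumed to be given, whereas in practice it would have to be estimated (e.g., from the variability of estimates across reasonable analysis choices).

```python
import numpy as np
from statistics import NormalDist


def calibrated_ci(x, delta=0.0, level=0.95):
    """Confidence interval for the mean that adds an assumed
    distributional-uncertainty variance (delta**2) to the usual
    sampling variance before forming the interval.

    delta is a hypothetical, user-supplied perturbation scale;
    delta=0 recovers the standard sampling-based interval.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean = x.mean()
    sampling_var = x.var(ddof=1) / n              # usual sampling uncertainty
    total_se = np.sqrt(sampling_var + delta**2)   # widen by distributional term
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # normal quantile
    return mean - z * total_se, mean + z * total_se
```

With `delta > 0` the interval is strictly wider than the standard one, reflecting that the data-generating distribution may itself differ from the target distribution.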