On the implied weights of linear regression for causal inference
In this paper, we derive and analyze the implied weights of linear regression methods for causal inference. We obtain new closed-form, finite-sample expressions of the weights for various types of estimators based on multivariate linear regression models. In finite samples, we show that the implied weights have minimum variance, exactly balance the means of the covariates (or transformations thereof) included in the model, and produce estimators that may not be sample bounded. Furthermore, depending on the specification of the regression model, we show that the implied weights may distort the structure of the sample in such a way that the resulting estimator is biased for the average treatment effect for a given target population. In large samples, we demonstrate that, under certain functional form assumptions, the implied weights are consistent estimators of the true inverse probability weights. We examine doubly robust properties of regression estimators from the perspective of their implied weights. We also derive and analyze the implied weights of weighted least squares regression. The equivalence between minimizing regression residuals and optimizing for certain weights allows us to bridge ideas from the regression modeling and causal inference literatures. As a result, we propose a set of regression diagnostics for causal inference. We discuss the connection of the implied weights to existing matching and weighting approaches. As special cases, we analyze the implied weights in common settings such as multi-valued treatments, regression after matching, and two-stage least squares regression with instrumental variables.
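As a rough illustration of what "implied weights" means, the sketch below covers only the simplest case: an ordinary least squares regression of the outcome on an intercept, a binary treatment indicator, and the covariates, with no interactions. It uses the standard Frisch-Waugh-Lovell partialling-out identity rather than the paper's own closed-form expressions, which are not reproduced in the abstract. The function name `implied_weights` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def implied_weights(X, z):
    """Implied unit-level weights of the OLS treatment coefficient.

    Assumes the model y ~ 1 + z + X with binary z (illustrative only).
    By the Frisch-Waugh-Lovell theorem, the coefficient on z equals
    sum(z_res * y) / sum(z_res ** 2), where z_res is the residual from
    regressing z on (1, X). Splitting that sum by treatment group gives
    one weight per unit.
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones(len(z)), X])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    z_res = z - design @ coef
    signed = z_res / np.sum(z_res ** 2)
    # Flip the sign for controls so the estimator reads as a difference
    # of two weighted means; individual weights can still be negative.
    return np.where(z == 1, signed, -signed)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
z = (rng.uniform(size=n) < 1 / (1 + np.exp(-X[:, 0]))).astype(float)
y = 1.5 * z + X @ np.array([1.0, -0.5]) + rng.normal(size=n)

w = implied_weights(X, z)
treated, control = z == 1, z == 0

# Weighted difference in outcomes using the implied weights.
tau_weights = np.sum(w[treated] * y[treated]) - np.sum(w[control] * y[control])

# The same quantity from fitting the regression directly.
full = np.column_stack([np.ones(n), z, X])
tau_ols = np.linalg.lstsq(full, y, rcond=None)[0][1]

# Weighted covariate means in each group; these should coincide.
balance_treated = w[treated] @ X[treated]
balance_control = w[control] @ X[control]
```

In this setting the weights sum to one within each treatment group and the weighted covariate means of the two groups coincide, which matches the exact mean balance property stated above. Because some weights can be negative, the weighted means are not guaranteed to lie within the range of the observed outcomes, which is the sense in which the estimator may fail to be sample bounded.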