Propensity score weighting under limited overlap and model misspecification
Propensity score (PS) weighting methods are often used in non-randomized studies to adjust for confounding and assess treatment effects. The most popular among them, inverse probability weighting (IPW), assigns weights proportional to the inverse of the conditional probability of the treatment actually received, given observed covariates. A key requirement for IPW estimation is the positivity assumption, i.e., the PS must be bounded away from 0 and 1. In practice, violations of the positivity assumption often manifest as limited overlap in the PS distributions between treatment groups. When these practical violations occur, a small number of highly influential IPW weights may lead to unstable IPW estimators, with biased estimates and large variances. To mitigate these issues, a number of alternative methods have been proposed, including IPW trimming, overlap weights (OW), matching weights (MW), and entropy weights (EW). Because OW, MW, and EW target the population for whom there is equipoise (and adequate overlap), and because their estimands depend on the true PS, a common criticism is that these estimators may be more sensitive to misspecification of the PS model. In this paper, we conduct extensive simulation studies to compare the performance of IPW and IPW trimming against that of OW, MW, and EW under limited overlap and misspecified PS models. Across the wide range of scenarios we considered, OW, MW, and EW consistently outperform IPW in terms of bias, root mean squared error, and coverage probability.
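To make the weighting schemes named above concrete, the following minimal sketch computes each set of weights from an estimated PS using the standard balancing-weight form: a tilting function h(e) divided by e for treated units and by 1 - e for controls. It is an illustration under stated assumptions, not the simulation code used in the paper. The function names `balancing_weights` and `weighted_difference`, the `trim` argument, and the trimming threshold are hypothetical, and the PS is assumed to lie strictly between 0 and 1.

```python
import numpy as np

def balancing_weights(ps, treated, method="ipw", trim=None):
    """Return one weight per unit for a given PS weighting scheme.

    ps      : estimated propensity scores, assumed strictly inside (0, 1)
    treated : 1/True for treated units, 0/False for controls
    method  : "ipw", "ow", "mw", or "ew"
    trim    : optional threshold a; units with ps outside [a, 1 - a]
              get weight 0 (a simple symmetric trimming rule, assumed here)
    """
    ps = np.asarray(ps, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Tilting function h(e) that defines the target population.
    if method == "ipw":
        h = np.ones_like(ps)                      # whole population
    elif method == "ow":
        h = ps * (1.0 - ps)                       # overlap weights
    elif method == "mw":
        h = np.minimum(ps, 1.0 - ps)              # matching weights
    elif method == "ew":
        h = -(ps * np.log(ps) + (1.0 - ps) * np.log(1.0 - ps))  # entropy weights
    else:
        raise ValueError(f"unknown method: {method}")

    # Treated units get h(e)/e, controls get h(e)/(1 - e).
    w = np.where(treated, h / ps, h / (1.0 - ps))

    if trim is not None:
        keep = (ps >= trim) & (ps <= 1.0 - trim)
        w = np.where(keep, w, 0.0)
    return w

def weighted_difference(y, treated, w):
    """Normalized (Hajek-type) weighted difference in mean outcomes."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    w = np.asarray(w, dtype=float)
    mean_treated = np.sum(w[treated] * y[treated]) / np.sum(w[treated])
    mean_control = np.sum(w[~treated] * y[~treated]) / np.sum(w[~treated])
    return mean_treated - mean_control
```

For a treated unit with an estimated PS of 0.02, the IPW weight is 1/0.02 = 50, whereas the OW weight is 0.98 and the MW weight is 1. This boundedness is why a few extreme scores can dominate an IPW estimate but not an OW, MW, or EW estimate. Note that each tilting function also changes the estimand, so the resulting weighted differences are not estimates of the same quantity.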