A Lasso-OLS Hybrid Approach to Covariate Selection and Average Treatment Effect Estimation for Clustered RCTs Using Design-Based Methods

05/05/2020
by Peter Z. Schochet, et al.

Statistical power is often a concern for clustered RCTs due to variance inflation from design effects and the high cost of adding study clusters (such as hospitals, schools, or communities). While covariate pre-specification is the preferred approach for improving power to estimate regression-adjusted average treatment effects (ATEs), further precision gains can be achieved through covariate selection once primary outcomes have been collected. This article uses design-based methods underlying clustered RCTs to develop a Lasso-OLS hybrid procedure for the post-hoc selection of covariates and ATE estimation that avoids model overfitting and lack of transparency. In the first stage, lasso estimation is conducted using cluster-level averages, where asymptotic normality is proved using a new central limit theorem for finite population regression estimators. In the second stage, ATEs and design-based standard errors are estimated using weighted least squares with the first stage lasso covariates. This nonparametric approach applies to continuous, binary, and discrete outcomes. Simulation results indicate that Type 1 errors of the second stage ATE estimates are near nominal values and standard errors are near true ones, although somewhat conservative with small samples. The method is demonstrated using data from a large, federally funded clustered RCT testing the effects of school-based programs promoting behavioral health.
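
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated data, the function names (`lasso_cd`, `lasso_ols_hybrid`), the penalty value `lam`, and the choice to leave the treatment indicator unpenalized by partialling it out before the lasso step are all assumptions made for illustration. The design-based standard errors are not reproduced here.

```python
import numpy as np

def soft_threshold(z, lam):
    # Soft-thresholding operator used by coordinate-descent lasso.
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=500):
    # Minimizes (1/2m)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.
    # Assumes y and the columns of X are already centered.
    m, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    for _ in range(n_iter):
        for k in range(p):
            resid += X[:, k] * beta[k]          # drop covariate k from the fit
            rho = X[:, k] @ resid / m
            beta[k] = soft_threshold(rho, lam) / (X[:, k] @ X[:, k] / m)
            resid -= X[:, k] * beta[k]          # add it back with the new value
    return beta

def lasso_ols_hybrid(y_bar, T, X_bar, w, lam):
    # Stage 1 (assumed variant): lasso on cluster-level averages, with the
    # treatment indicator left unpenalized by demeaning within each arm.
    y_res, X_res = y_bar.copy(), X_bar.copy()
    for arm in (0, 1):
        mask = T == arm
        y_res[mask] -= y_bar[mask].mean()
        X_res[mask] -= X_bar[mask].mean(axis=0)
    X_std = X_res / X_res.std(axis=0)
    selected = np.flatnonzero(lasso_cd(X_std, y_res, lam))

    # Stage 2: weighted least squares of cluster means on the treatment
    # indicator and the selected covariates, weighted by cluster size.
    X_sel = X_bar[:, selected]
    X_sel = X_sel - (w @ X_sel) / w.sum()       # center at the weighted mean
    D = np.column_stack([np.ones(len(T)), T, X_sel])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(D * sw[:, None], y_bar * sw, rcond=None)
    return coef[1], selected                    # coef[1] is the ATE estimate

# Hypothetical simulated trial: 200 clusters, 10 candidate covariates,
# only the first two of which affect the outcome. True ATE is 0.5.
rng = np.random.default_rng(0)
m, p = 200, 10
w = rng.integers(20, 60, size=m).astype(float)  # cluster sizes
T = np.zeros(m)
T[rng.permutation(m)[: m // 2]] = 1.0           # cluster-level randomization
X_bar = rng.normal(size=(m, p))                 # cluster-level covariate means
y_bar = (0.5 * T + 1.0 * X_bar[:, 0] - 0.8 * X_bar[:, 1]
         + rng.normal(scale=0.3, size=m))

ate, selected = lasso_ols_hybrid(y_bar, T, X_bar, w, lam=0.1)
print("ATE estimate:", ate, "selected covariates:", selected)
```

In this sketch the lasso is used only to choose covariates; the reported ATE comes from the unpenalized second-stage regression, which is the sense in which the procedure is a hybrid.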

research
10/14/2019

Principled estimation of regression discontinuity designs with covariates: a machine learning approach

The regression discontinuity design (RDD) has become the "gold standard"...
research
11/19/2020

A general theory of regression adjustment for covariate-adaptive randomization: OLS, Lasso, and beyond

We consider the problem of estimating and inferring treatment effects in...
research
09/23/2021

Blocking, rerandomization, and regression adjustment in randomized experiments with high-dimensional covariates

Blocking, a special case of rerandomization, is routinely implemented in...
research
05/03/2021

Reconciling design-based and model-based causal inferences for split-plot experiments

The split-plot design assigns different interventions at the whole-plot ...
research
12/04/2018

Estimating cluster-level local average treatment effects in cluster randomised trials with non-adherence

Non-adherence to assigned treatment is a common issue in cluster randomi...
research
12/24/2020

Dependence of variance on covariate design in nonparametric link regression

This note discusses a nonparametric approach to link regression aiming a...
research
05/31/2021

Regression-Adjusted Estimation of Quantile Treatment Effects under Covariate-Adaptive Randomizations

This paper examines regression-adjusted estimation and inference of unco...