Regularizing Double Machine Learning in Partially Linear Endogenous Models
We estimate the linear coefficient in a partially linear model with confounding variables. We rely on double machine learning (DML) and extend it with an additional regularization and selection scheme. We allow for more general dependence structures among the model variables than what has been investigated previously, and we prove that this DML estimator remains asymptotically Gaussian and converges at the parametric rate. The DML estimator has a two-stage least squares interpretation and may produce overly wide confidence intervals. To address this issue, we propose the regularization-selection regsDML method that leads to narrower confidence intervals. It is fully data driven and optimizes an estimated asymptotic mean squared error of the coefficient estimate. Empirical examples demonstrate our methodological and theoretical developments. Software code for our regsDML method will be made available in the R-package dmlalg.
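The two-stage least squares interpretation and the effect of regularization can be illustrated with a minimal sketch. It is not the dmlalg implementation, and the following are all assumptions made for illustration: the simulated data, the variable names (`w` for observed covariates, `a` for an instrument-like variable, `h` for a hidden confounder), the cross-fitted polynomial fit standing in for a machine learning method, and the specific regularized objective with a hand-picked `gamma`. The regsDML method described above selects its regularization from the data by optimizing an estimated asymptotic mean squared error, which this sketch does not attempt.

```python
import numpy as np
from numpy.polynomial.polynomial import polyvander


def residualize(target, w, folds, degree=5):
    """Cross-fitted residuals target - E[target | w].

    A polynomial least-squares fit is used as a stand-in learner (assumption).
    """
    resid = np.empty_like(target)
    for test_idx in folds:
        train = np.ones(len(w), dtype=bool)
        train[test_idx] = False
        coef = np.linalg.lstsq(
            polyvander(w[train], degree), target[train], rcond=None
        )[0]
        resid[test_idx] = target[test_idx] - polyvander(w[test_idx], degree) @ coef
    return resid


def regularized_estimate(r_y, r_x, r_a, gamma):
    """Minimize ||(I - P)(r_y - r_x b)||^2 + gamma * ||P(r_y - r_x b)||^2 over b,

    where P projects onto the residualized instrument r_a (illustrative objective).
    gamma = 1 gives least squares on the residuals; large gamma approaches TSLS.
    """
    def proj(v):
        return r_a * (r_a @ v) / (r_a @ r_a)

    p_y, p_x = proj(r_y), proj(r_x)
    num = r_x @ (r_y - p_y) + gamma * (r_x @ p_y)
    den = r_x @ (r_x - p_x) + gamma * (r_x @ p_x)
    return num / den


# Simulated partially linear model with a hidden confounder h (hypothetical data).
rng = np.random.default_rng(0)
n, beta_true = 4000, 1.5
w = rng.uniform(-2, 2, n)
h = rng.normal(size=n)
a = np.sin(w) + rng.normal(size=n)
x = a + np.cos(w) + h + rng.normal(size=n)
y = beta_true * x + w**2 - 2 * h + rng.normal(size=n)

# Two folds for cross-fitting the nuisance regressions on w.
folds = np.array_split(rng.permutation(n), 2)
r_y, r_x, r_a = (residualize(v, w, folds) for v in (y, x, a))

beta_ols = (r_x @ r_y) / (r_x @ r_x)    # ignores hidden confounding, biased here
beta_tsls = (r_a @ r_y) / (r_a @ r_x)   # two-stage least squares on residuals
beta_reg = regularized_estimate(r_y, r_x, r_a, gamma=4.0)

print(f"OLS on residuals: {beta_ols:.3f}")
print(f"TSLS on residuals: {beta_tsls:.3f}")
print(f"regularized (gamma=4): {beta_reg:.3f}")
```

In this sketch the regularized estimate always lies between the least-squares and TSLS estimates. A finite `gamma` accepts some confounding bias in exchange for lower variance, which is the trade-off that a mean-squared-error criterion is meant to balance.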