Double machine learning for sample selection models

11/30/2020
by Michela Bia, et al.

This paper considers treatment evaluation when outcomes are only observed for a subpopulation due to sample selection or outcome attrition/non-response. For identification, we combine a selection-on-observables assumption for treatment assignment with either selection-on-observables or instrumental variable assumptions concerning the outcome attrition/sample selection process. To control in a data-driven way for potentially high-dimensional pre-treatment covariates that motivate the selection-on-observables assumptions, we adapt the double machine learning framework to sample selection problems. That is, we make use of (a) Neyman-orthogonal and doubly robust score functions, which make treatment effect estimation robust to moderate regularization biases in the machine learning-based estimation of the outcome, treatment, or sample selection models, and (b) sample splitting (or cross-fitting) to prevent overfitting bias. We demonstrate that the proposed estimators are asymptotically normal and root-n consistent under specific regularity conditions concerning the machine learners. The estimator is available in the causalweight package for the statistical software R.
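To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of cross-fitting combined with a doubly robust score, for the case where both treatment and sample selection are assumed to be selection-on-observables. It is an illustration under stated assumptions, not the paper's implementation and not the causalweight API. The score used, mu(d,X) + 1{D=d} * S * (Y - mu(d,X)) / (p_d(X) * pi(d,X)), is one standard doubly robust form and is an assumption here, since the abstract does not spell it out. The nuisance functions are estimated by simple cell means on a single discrete covariate rather than by machine learners, and all names (`simulate`, `fit_nuisances`, `score`, `dml_ate`) and the simulated data are hypothetical.

```python
import random
from collections import defaultdict


def simulate(n, seed=0):
    """Toy data: covariate x, treatment d, selection s, outcome y (None if unobserved)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 2)                                   # discrete covariate
        d = 1 if rng.random() < 0.3 + 0.2 * x else 0            # treatment depends on x
        s = 1 if rng.random() < 0.5 + 0.1 * x + 0.1 * d else 0  # selection depends on d, x
        y = 1.0 * d + 0.5 * x + rng.gauss(0, 1)                 # true average effect is 1
        data.append((x, d, s, y if s == 1 else None))
    return data


def fit_nuisances(train):
    """Cell-mean estimates (stand-ins for machine learners; assumes no empty cells):
    p[x] = P(D=1|X=x), pi[d,x] = P(S=1|D=d,X=x), mu[d,x] = E[Y|D=d,S=1,X=x]."""
    n_x, n_d1 = defaultdict(int), defaultdict(int)
    n_dx, n_s, sum_y = defaultdict(int), defaultdict(int), defaultdict(float)
    for x, d, s, y in train:
        n_x[x] += 1
        n_d1[x] += d
        n_dx[(d, x)] += 1
        if s == 1:
            n_s[(d, x)] += 1
            sum_y[(d, x)] += y
    p = {x: n_d1[x] / n_x[x] for x in n_x}
    pi = {key: n_s[key] / n_dx[key] for key in n_dx}
    mu = {key: sum_y[key] / n_s[key] for key in n_s}
    return p, pi, mu


def score(obs, d, p, pi, mu):
    """Assumed doubly robust score for the mean potential outcome under treatment d."""
    x, d_obs, s, y = obs
    p_d = p[x] if d == 1 else 1.0 - p[x]
    m = mu[(d, x)]
    if d_obs == d and s == 1:
        # Inverse-probability-weighted residual corrects the outcome-model prediction.
        return m + (y - m) / (p_d * pi[(d, x)])
    return m


def dml_ate(data, k=2, seed=1):
    """Cross-fitting: fit nuisances on k-1 folds, evaluate scores on the held-out fold."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]
    scores = []
    for j in range(k):
        held_out = set(folds[j])
        train = [data[i] for i in idx if i not in held_out]
        p, pi, mu = fit_nuisances(train)
        for i in folds[j]:
            scores.append(score(data[i], 1, p, pi, mu) - score(data[i], 0, p, pi, mu))
    n = len(scores)
    ate = sum(scores) / n
    var = sum((v - ate) ** 2 for v in scores) / (n - 1)
    return ate, (var / n) ** 0.5  # point estimate and standard error


if __name__ == "__main__":
    estimate, std_err = dml_ate(simulate(4000))
    print(f"estimated effect: {estimate:.3f} (s.e. {std_err:.3f})")
```

The held-out evaluation is what the abstract refers to as sample splitting: each observation's score uses nuisance estimates fitted without that observation. In practice the cell means would be replaced by flexible learners, and propensities close to zero would typically need trimming.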


research
12/01/2020

Evaluating (weighted) dynamic treatment effects by double machine learning

We consider evaluating the causal effects of dynamic treatments, i.e. of...
research
09/09/2022

Estimating Heterogeneous Bounds for Treatment Effects under Sample Selection and Non-response

In this paper we propose a method for nonparametric estimation and infer...
research
12/03/2020

A Generalized Heckman Model With Varying Sample Selection Bias and Dispersion Parameters

Many proposals have emerged as alternatives to the Heckman selection mod...
research
07/06/2020

Cross-Fitting and Averaging for Machine Learning Estimation of Heterogeneous Treatment Effects

We investigate the finite sample performance of sample splitting, cross-...
research
06/02/2022

Coordinated Double Machine Learning

Double machine learning is a statistical method for leveraging complex b...
research
01/30/2022

Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance

Estimation of causal effects using machine learning methods has become a...
research
07/05/2017

Machine Learning Tests for Effects on Multiple Outcomes

A core challenge in the analysis of experimental data is that the impact...
