Cross-Fitting and Averaging for Machine Learning Estimation of Heterogeneous Treatment Effects

07/06/2020
by Daniel Jacob, et al.

We investigate the finite-sample performance of sample splitting, cross-fitting, and averaging for the estimation of the conditional average treatment effect. Recently proposed methods, so-called meta-learners, use machine learning to estimate different nuisance functions and hence allow for fewer restrictions on the underlying structure of the data. To limit the overfitting bias that may result from using machine learning methods, cross-fitting estimators have been proposed. These involve splitting the data into different folds. To the best of our knowledge, it is not yet clear how exactly the data should be split and averaged. We conduct a simulation study with different data-generating processes and consider different estimators that vary in their sample-splitting, cross-fitting, and averaging procedures. We investigate the performance of each estimator independently on four different meta-learners: the doubly-robust-learner, the R-learner, the T-learner, and the X-learner. We find that the performance of all meta-learners depends heavily on the procedure of splitting and averaging. The best performance in terms of mean squared error (MSE) is achieved by a 5-fold cross-fitting estimator that is averaged by the median over multiple different sample splits.
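
The procedure the abstract singles out (5-fold cross-fitting, repeated over several random splits and aggregated by the median) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a T-learner whose two outcome models are simple one-covariate least-squares lines standing in for arbitrary machine learning methods, and the function names (`fit_line`, `cross_fit_t_learner`, `median_cross_fit`, `simulate`), the default of 20 repeated splits, and the simulated data are all hypothetical choices made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; a stand-in for any ML learner.

    Assumes xs is non-empty (each treatment arm needs training data).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def cross_fit_t_learner(x, d, y, k, rng):
    """One K-fold cross-fit: each unit's effect is predicted by models
    trained only on the other K-1 folds."""
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[j::k] for j in range(k)]
    tau = [0.0] * len(x)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        treated = [i for i in train if d[i] == 1]
        control = [i for i in train if d[i] == 0]
        mu1 = fit_line([x[i] for i in treated], [y[i] for i in treated])
        mu0 = fit_line([x[i] for i in control], [y[i] for i in control])
        for i in fold:
            # T-learner: difference of the two fitted outcome models
            tau[i] = mu1(x[i]) - mu0(x[i])
    return tau


def median_cross_fit(x, d, y, k=5, n_splits=20, seed=0):
    """Repeat the K-fold cross-fit over n_splits random partitions and
    take the per-unit median of the resulting effect estimates."""
    rng = random.Random(seed)
    runs = [cross_fit_t_learner(x, d, y, k, rng) for _ in range(n_splits)]
    return [statistics.median(unit_estimates) for unit_estimates in zip(*runs)]


def simulate(n=500, seed=1):
    """Hypothetical data: randomized treatment, true effect tau(x) = 1 + 2x."""
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    d = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    true_tau = [1 + 2 * xi for xi in x]
    y = [xi + di * ti + rng.gauss(0, 0.5) for xi, di, ti in zip(x, d, true_tau)]
    return x, d, y, true_tau


if __name__ == "__main__":
    x, d, y, true_tau = simulate()
    tau_hat = median_cross_fit(x, d, y)
    mse = sum((a - b) ** 2 for a, b in zip(tau_hat, true_tau)) / len(x)
    print(f"MSE of median-aggregated 5-fold cross-fit: {mse:.4f}")
```

Taking the median across splits rather than the mean makes the aggregate less sensitive to a single unlucky partition, which is one plausible reading of why the abstract reports it performing best; the sketch itself does not reproduce the paper's comparison.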

research
01/30/2022

Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance

Estimation of causal effects using machine learning methods has become a...
research
11/07/2019

Group Average Treatment Effects for Observational Studies

The paper proposes an estimator to make inference on key features of het...
research
11/30/2020

Double machine learning for sample selection models

This paper considers treatment evaluation when outcomes are only observe...
research
06/03/2022

Debiased Machine Learning without Sample-Splitting for Stable Estimators

Estimation and inference on causal parameters is typically reduced to a ...
research
01/27/2018

Cross-Fitting and Fast Remainder Rates for Semiparametric Estimation

There are many interesting and widely used estimators of a functional wi...
research
12/30/2022

Heterogeneous Synthetic Learner for Panel Data

In the new era of personalization, learning the heterogeneous treatment ...
research
03/22/2023

On a General Class of Orthogonal Learners for the Estimation of Heterogeneous Treatment Effects

Motivated by applications in personalized medicine and individualized po...
