Machine Learning for Variance Reduction in Online Experiments

06/14/2021
by Yongyi Guo, et al.

We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator, and about 19% lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.
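
The cross-fitting and regression-adjustment steps described in the abstract can be illustrated with a small simulation. What follows is a minimal sketch, not the authors' implementation. It rests on several assumptions that do not come from the abstract: a plain least-squares fit stands in for the machine learning predictor, the adjustment regresses the outcome on the treatment indicator, the cross-fitted prediction, and their centered interaction, and the data are synthetic. All function names (`cross_fit_predictions`, `adjusted_effect`, `difference_in_means`, `simulate`) are hypothetical.

```python
import numpy as np


def cross_fit_predictions(X, y, n_folds=2, seed=0):
    """Out-of-fold predictions of y from X.

    Each unit is predicted by a model fit only on the other folds, so no
    unit's own outcome influences its prediction. A linear least-squares
    fit is used here as a stand-in for an arbitrary ML model (assumption).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds      # random, near-equal fold labels
    Xb = np.column_stack([np.ones(n), X])     # add an intercept column
    preds = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        beta, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        preds[test] = Xb[test] @ beta
    return preds


def difference_in_means(y, t):
    """Unadjusted estimator: mean outcome of treated minus mean of control."""
    return y[t == 1].mean() - y[t == 0].mean()


def adjusted_effect(y, t, g):
    """Regression-adjusted estimator (assumed specification).

    OLS of y on [1, t, g, t * (g - mean(g))]; the coefficient on t is
    returned as the treatment effect estimate.
    """
    g_centered = g - g.mean()
    Z = np.column_stack([np.ones(len(y)), t, g, t * g_centered])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1]


def simulate(n_reps=200, n=400, seed=1):
    """Repeated synthetic A/A tests; returns the variance of each estimator."""
    rng = np.random.default_rng(seed)
    weights = np.array([1.0, -2.0, 0.5])      # synthetic covariate effects
    dim, adj = [], []
    for _ in range(n_reps):
        X = rng.normal(size=(n, 3))           # covariates, independent of t
        t = rng.integers(0, 2, size=n)        # random assignment, 0 or 1
        y = X @ weights + rng.normal(size=n)  # true treatment effect is zero
        g = cross_fit_predictions(X, y)
        dim.append(difference_in_means(y, t))
        adj.append(adjusted_effect(y, t, g))
    return np.var(dim), np.var(adj)


if __name__ == "__main__":
    var_dim, var_adj = simulate()
    print("variance, difference in means:", var_dim)
    print("variance, adjusted estimator: ", var_adj)
```

In this synthetic setup the covariates explain most of the outcome variance, so the adjusted estimator should vary much less across replications than the difference in means. How large the gap is depends entirely on the simulated data and says nothing about the figures reported for the Facebook metrics.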

research
10/26/2021

Towards Optimal Variance Reduction in Online Controlled Experiments

We study the optimal variance reduction solutions for online controlled ...
research
11/30/2021

Efficiency of Regression (Un)-Adjusted Rosenbaum's Rank-based Estimator in Randomized Experiments

A completely randomized experiment allows us to estimate the causal effe...
research
10/05/2020

Empirical Likelihood Inference in Randomized Controlled Trials with High-Dimensional Covariates

In this paper, we propose a data-adaptive empirical likelihood approach ...
research
08/02/2021

The Bias-Variance Tradeoff of Doubly Robust Estimator with Targeted L_1 regularized Neural Networks Predictions

The Doubly Robust (DR) estimation of ATE can be carried out in 2 steps, ...
research
12/05/2021

A Robust, Differentially Private Randomized Experiment for Evaluating Online Educational Programs With Sensitive Student Data

Randomized control trials (RCTs) have been the gold standard to evaluate...
research
11/11/2020

Learning a high-dimensional classification rule using auxiliary outcomes

Correlated outcomes are common in many practical problems. Based on a de...
research
03/09/2019

Two paradoxical results in linear models: the variance inflation factor and the analysis of covariance

A result from a standard linear model course is that the variance of the...