Measuring Average Treatment Effect from Heavy-tailed Data

by Jason, et al.

Heavy-tailed metrics are common and often critical to product evaluation in the online world. Even when samples are large enough for the Central Limit Theorem to apply, experimentation is challenging because the confidence intervals of the estimates are wide. We demonstrate the difficulty by running A/A simulations with customer spending data from a large-scale e-commerce site, and then explore solutions. On one front, we address the heavy tail directly and highlight the often-ignored nuances of winsorization; in particular, the validity of the false positive rate could be at risk. Inspired by robust statistics, we further introduce Huber regression as a better way to measure treatment effect. On another front, we exploit covariates from the pre-experiment period. Although they are independent of assignment and may explain much of the variation in the response, the concern is that models are trained to minimize prediction error rather than the bias of the parameter estimate. We find the framework of orthogonal learning useful: it matches not raw observations but residuals from two predictions, one of the response and the other of the assignment. Robust regression integrates readily, together with cross-fitting. The final design proves highly effective at driving down variance while controlling bias. It now supports our daily practice and, we hope, can benefit other applications in the industry.
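The residual-on-residual idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything beyond the abstract is an assumption made for illustration: a single pre-experiment covariate, a one-variable least-squares model for both predictions, a two-fold split, and a Huber fit computed by iteratively reweighted least squares with a MAD-based scale. All function names are hypothetical.

```python
import random
import statistics


def ols_fit(x, y):
    """Least-squares intercept and slope for a single covariate."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def cross_fit_residuals(x, target, n_folds=2, seed=0):
    """Out-of-fold residuals: each unit is predicted by a model fit on the other folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    resid = [0.0] * len(x)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        a, b = ols_fit([x[i] for i in train], [target[i] for i in train])
        for i in fold:
            resid[i] = target[i] - (a + b * x[i])
    return resid


def huber_fit(u, v, delta=1.345, n_iter=50):
    """Intercept and slope of v on u under Huber loss (assumed IRLS solver)."""
    a, b = ols_fit(u, v)  # start from the least-squares solution
    for _ in range(n_iter):
        r = [vi - a - b * ui for ui, vi in zip(u, v)]
        # Robust scale from the median absolute residual; fall back to 1.0 if it is zero.
        scale = statistics.median([abs(ri) for ri in r]) / 0.6745 or 1.0
        cut = delta * scale
        # Huber weights: 1 inside the cut-off, shrinking as cut / |r| outside it.
        w = [1.0 if abs(ri) <= cut else cut / abs(ri) for ri in r]
        sw = sum(w)
        mu = sum(wi * ui for wi, ui in zip(w, u)) / sw
        mv = sum(wi * vi for wi, vi in zip(w, v)) / sw
        b = (sum(wi * (ui - mu) * (vi - mv) for wi, ui, vi in zip(w, u, v))
             / sum(wi * (ui - mu) ** 2 for wi, ui in zip(w, u)))
        a = mv - b * mu
    return a, b


def estimate_effect(x, t, y):
    """Treatment effect as the Huber slope of response residuals on assignment residuals."""
    ry = cross_fit_residuals(x, y)  # residual of the prediction of the response
    rt = cross_fit_residuals(x, t)  # residual of the prediction of the assignment
    _, effect = huber_fit(rt, ry)
    return effect
```

Here `x` stands for a pre-experiment covariate (for example, prior spending), `t` for a 0/1 assignment, and `y` for the response, all as equal-length lists. The intercept in the final fit is kept so that skew in the response residuals is absorbed there rather than in the slope.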




A Framework for the Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data

A central obstacle in the objective assessment of treatment effect (TE) ...

An Optimal Treatment Assignment Strategy to Evaluate Demand Response Effect

Demand response is designed to motivate electricity customers to modify ...

Learning Optimal Biomarker-Guided Treatment Policy for Chronic Disorders

Electroencephalogram (EEG) provides noninvasive measures of brain activi...

Treatment Effect Estimation using Invariant Risk Minimization

Inferring causal individual treatment effect (ITE) from observational da...

The Generalized Oaxaca-Blinder Estimator

After performing a randomized experiment, researchers often use ordinary...

Online Testing of Subgroup Treatment Effects Based on Value Difference

Online A/B testing plays a critical role in the high-tech industry to gu...

Trimmed Match Design for Randomized Paired Geo Experiments

How to measure the incremental Return On Ad Spend (iROAS) is a fundament...