A Common Misassumption in Online Experiments with Machine Learning Models

04/21/2023
by Olivier Jeunen, et al.

Online experiments such as Randomised Controlled Trials (RCTs) or A/B-tests are the bread and butter of modern platforms on the web. They are conducted continuously to allow platforms to estimate the causal effect of replacing system variant "A" with variant "B" on some metric of interest. These variants can differ in many aspects. In this paper, we focus on the common use-case where they correspond to machine learning models. The online experiment then serves as the final arbiter that decides which model is superior and should thus be shipped. The statistical literature on causal effect estimation from RCTs has a substantial history, which contributes deservedly to the level of trust researchers and practitioners have in this "gold standard" of evaluation practices. Nevertheless, in the particular case of machine learning experiments, we remark that certain critical issues remain. Specifically, the assumptions required for A/B-tests to yield unbiased estimates of the causal effect are seldom met in practical applications. We argue that, because variants typically learn from pooled data, the absence of interference between models cannot be guaranteed. This undermines the conclusions we can draw from online experiments with machine learning models. We discuss the implications this has for practitioners and for the research literature.
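
The interference argument can be made concrete with a toy simulation. The sketch below is not taken from the paper: the item click rates, the greedy variant "A", the epsilon-greedy variant "B", and every function name are hypothetical choices made purely for illustration. It runs the same 50/50 randomised experiment twice, once with both variants learning from pooled data and once with each variant learning only from its own traffic.

```python
import random

# Hypothetical per-item click probabilities (an assumption for this toy example).
CLICK_PROBS = [0.1, 0.2, 0.3, 0.4, 0.5]


class Stats:
    """Per-item impression and click counts: the 'training data' of a variant."""

    def __init__(self, n_items):
        self.clicks = [0] * n_items
        self.shows = [0] * n_items

    def update(self, item, click):
        self.shows[item] += 1
        self.clicks[item] += click

    def best_item(self):
        # Empirical click rate; unseen items count as 0.0, ties go to the lowest index.
        rates = [c / s if s else 0.0 for c, s in zip(self.clicks, self.shows)]
        return max(range(len(rates)), key=rates.__getitem__)


def choose(variant, stats, rng, epsilon=0.1):
    # Variant "A" is purely greedy; variant "B" explores with probability epsilon.
    if variant == "B" and rng.random() < epsilon:
        return rng.randrange(len(CLICK_PROBS))
    return stats.best_item()


def run_ab_test(pooled, n_users=20000, seed=0):
    """Return the mean click rate per variant in a 50/50 randomised experiment."""
    rng = random.Random(seed)
    n_items = len(CLICK_PROBS)
    if pooled:
        shared = Stats(n_items)
        stats = {"A": shared, "B": shared}  # both variants learn from all traffic
    else:
        stats = {"A": Stats(n_items), "B": Stats(n_items)}  # own traffic only
    clicks = {"A": 0, "B": 0}
    users = {"A": 0, "B": 0}
    for _ in range(n_users):
        variant = rng.choice("AB")  # random assignment of the user to an arm
        item = choose(variant, stats[variant], rng)
        click = int(rng.random() < CLICK_PROBS[item])
        stats[variant].update(item, click)
        clicks[variant] += click
        users[variant] += 1
    return {v: clicks[v] / users[v] for v in "AB"}


if __name__ == "__main__":
    for pooled in (True, False):
        result = run_ab_test(pooled)
        label = "pooled data  " if pooled else "separate data"
        print(label, {v: round(r, 3) for v, r in result.items()},
              "B - A =", round(result["B"] - result["A"], 3))
```

In this toy setting, the greedy variant "A" benefits from the exploration performed by "B" whenever the data are pooled, so the measured difference between the arms shrinks relative to the run where each variant learns in isolation. Randomising users is therefore not enough on its own to recover the effect of shipping "B" instead of "A" under these assumptions.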

research
06/09/2023

Using Auxiliary Data to Boost Precision in the Analysis of A/B Tests on an Online Educational Platform: New Data and New Results

Randomized A/B tests within online learning platforms represent an excit...
research
04/08/2021

Causal Decision Making and Causal Effect Estimation Are Not the Same... and Why It Matters

Causal decision making (CDM) at scale has become a routine part of busin...
research
05/07/2021

Precise Unbiased Estimation in Randomized Experiments using Auxiliary Observational Data

Randomized controlled trials (RCTs) are increasingly prevalent in educat...
research
05/10/2021

An introduction to causal reasoning in health analytics

A data science task can be deemed as making sense of the data and/or tes...
research
04/05/2021

Revisiting Rashomon: A Comment on "The Two Cultures"

Here, I provide some reflections on Prof. Leo Breiman's "The Two Culture...
research
02/17/2021

Big Data meets Causal Survey Research: Understanding Nonresponse in the Recruitment of a Mixed-mode Online Panel

Survey scientists increasingly face the problem of high-dimensionality i...
research
10/25/2021

The Efficiency Misnomer

Model efficiency is a critical aspect of developing and deploying machin...
