Unbiased Experiments in Congested Networks

09/30/2021
by Bruce Spang, et al.

When developing a new networking algorithm, it is established practice to run a randomized experiment, or A/B test, to evaluate its performance. In an A/B test, traffic is randomly allocated between a treatment group, which uses the new algorithm, and a control group, which uses the existing algorithm. However, because networks are congested, treatment and control traffic compete with each other for resources in a way that biases the outcome of these tests. This bias can have a surprisingly large effect; for example, in lab A/B tests with two widely used congestion control algorithms, the treatment appeared to deliver 150 lower throughput when used by most flows, despite the fact that the two algorithms have identical throughput when used by all traffic. Beyond the lab, we show that A/B tests can also be biased at scale. In an experiment run in cooperation with Netflix, estimates from A/B tests mistake the direction of change of some metrics, miss changes in other metrics, and overestimate the size of effects. We propose alternative experiment designs, previously used in online platforms, to more accurately evaluate new algorithms and allow experimenters to better understand the impact of congestion on their tests.
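The interference described above can be illustrated with a toy model. This is a minimal sketch and is not taken from the paper: it assumes a single bottleneck link of fixed capacity that flows share in proportion to a weight, with control flows at weight 1 and treatment flows at a hypothetical `aggressiveness` weight. All function names and parameter values are illustrative assumptions.

```python
def per_flow_throughput(n_treatment, n_control, capacity=100.0, aggressiveness=2.0):
    """Toy bottleneck: each flow gets capacity in proportion to its weight.

    Assumption: control flows have weight 1.0 and treatment flows have
    weight `aggressiveness`. Returns (treatment, control) per-flow
    throughput; a group with no flows yields None.
    """
    total_weight = n_treatment * aggressiveness + n_control * 1.0
    treatment = capacity * aggressiveness / total_weight if n_treatment else None
    control = capacity / total_weight if n_control else None
    return treatment, control


def ab_estimate(n_flows, treated_fraction):
    """Difference in mean per-flow throughput when both groups share the link."""
    n_treatment = round(n_flows * treated_fraction)
    n_control = n_flows - n_treatment
    treatment, control = per_flow_throughput(n_treatment, n_control)
    return treatment - control


def total_treatment_effect(n_flows):
    """Difference between 'all flows treated' and 'all flows control'."""
    all_treated, _ = per_flow_throughput(n_flows, 0)
    _, all_control = per_flow_throughput(0, n_flows)
    return all_treated - all_control


if __name__ == "__main__":
    print("A/B estimate (50% treated):", ab_estimate(10, 0.5))
    print("Total treatment effect:    ", total_treatment_effect(10))
```

In this sketch the A/B estimate is nonzero because treatment flows take capacity away from control flows on the shared link, while the total treatment effect is zero: when every flow runs the same algorithm, the link is split evenly either way. Real congestion control dynamics are far more complex than a fixed weight, so the numbers here only show the direction of the bias, not its size.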

