Once is Never Enough: Foundations for Sound Statistical Inference in Tor Network Experimentation

02/10/2021
by Rob Jansen, et al.

Tor is a popular low-latency anonymous communication system that focuses on usability and performance: a faster network will attract more users, which in turn will improve the anonymity of everyone using the system. The standard practice in previous research attempting to enhance Tor performance has been to draw conclusions from the observed results of a single simulation of standard Tor and of each research variant. But because the simulations are run in sampled Tor networks, it is possible that sampling error alone could cause the observed effects. Therefore, we call into question the practical meaning of any conclusions that are drawn without considering the statistical significance of the reported results. In this paper, we build foundations upon which we improve the Tor experimental method. First, we present a new Tor network modeling methodology that produces more representative Tor networks, as well as new and improved experimentation tools that run Tor simulations faster and at a larger scale than was previously possible. We showcase these contributions by running simulations with 6,489 relays and 792k simultaneously active users, the largest known Tor network simulations and the first at a network scale of 100%. Second, we present new statistical methodologies through which we: (i) show that running multiple simulations in independently sampled networks is necessary in order to produce informative results; and (ii) show how to use the results from multiple simulations to conduct sound statistical inference. We present a case study using 420 simulations to demonstrate how to apply our methodologies to a concrete set of Tor experiments and how to analyze the results.
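
As a rough illustration of point (ii), the sketch below shows one simple way to aggregate a per-simulation summary metric collected from several independently sampled Tor networks into a confidence interval. It is a hypothetical example, not the paper's actual analysis: the metric name, the Student-t interval, and all numbers are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats

def mean_ci(per_simulation_values, confidence=0.95):
    """Student-t confidence interval for the mean of a per-simulation metric.

    per_simulation_values: one summary value (e.g., a median download time)
    from each independently sampled Tor network simulation.
    """
    x = np.asarray(per_simulation_values, dtype=float)
    n = len(x)
    mean = x.mean()
    sem = stats.sem(x)  # standard error of the mean across simulations
    half_width = sem * stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return mean, mean - half_width, mean + half_width

# Hypothetical median time-to-last-byte (seconds) from five vanilla-Tor runs
# and five runs of a performance variant, each in a freshly sampled network.
vanilla = [4.1, 3.8, 4.5, 4.0, 4.3]
variant = [3.7, 4.2, 3.5, 3.9, 3.6]

for name, runs in [("vanilla", vanilla), ("variant", variant)]:
    m, lo, hi = mean_ci(runs)
    print(f"{name}: mean={m:.2f}s, 95% CI=({lo:.2f}s, {hi:.2f}s)")
```

Overlapping intervals in a sketch like this would indicate that sampling error alone could plausibly explain an observed difference, which is precisely why a single simulation per configuration is not enough.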
