Optimal Probing with Statistical Guarantees for Network Monitoring at Scale

09/16/2021
by Muhammad Jehangir Amjad, et al.

Cloud networks are difficult to monitor because they grow rapidly and the budgets for monitoring them are limited. We propose a framework for estimating network metrics, such as latency and packet loss, with guarantees on estimation errors for a fixed monitoring budget. Our algorithms, which are based on A- and E-optimal experimental designs from statistics, produce a distribution of probes across network paths, which we then monitor. Unfortunately, computing these designs exactly is too costly at production scale, so we propose scalable, near-optimal approximations based on the Frank-Wolfe algorithm. We validate our approaches in simulation on real network topologies and on a production probing system in a real cloud network. Compared to both production and academic baselines, our approaches substantially reduce the probing budget while maintaining low estimation errors, even at very low probing budgets.
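To make the idea of a Frank-Wolfe approximation to an A-optimal design more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes each path is a 0/1 vector marking the links it traverses, minimizes the textbook A-optimality criterion tr(M(p)^-1) with M(p) = sum_i p_i x_i x_i^T over the probability simplex, and adds a small ridge term so the matrix stays invertible. The function names, the ridge value, and the step-size schedule are all illustrative assumptions.

```python
def mat_inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def information_matrix(paths, p, ridge):
    """M(p) = ridge * I + sum_i p_i * x_i x_i^T (the ridge term is an assumption)."""
    d = len(paths[0])
    m = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for x, w in zip(paths, p):
        for i in range(d):
            for j in range(d):
                m[i][j] += w * x[i] * x[j]
    return m


def a_objective(paths, p, ridge=1e-6):
    """A-optimality criterion: trace of the inverse information matrix."""
    inv = mat_inverse(information_matrix(paths, p, ridge))
    return sum(inv[i][i] for i in range(len(inv)))


def frank_wolfe_a_optimal(paths, iters=200, ridge=1e-6):
    """Return a probing distribution over paths via Frank-Wolfe on tr(M(p)^-1)."""
    n, d = len(paths), len(paths[0])
    p = [1.0 / n] * n  # start from the uniform distribution
    for t in range(iters):
        inv = mat_inverse(information_matrix(paths, p, ridge))
        # The partial derivative w.r.t. p_i is -||M^-1 x_i||^2, so the best
        # simplex vertex is the path with the largest squared norm.
        scores = []
        for x in paths:
            v = [sum(inv[i][j] * x[j] for j in range(d)) for i in range(d)]
            scores.append(sum(c * c for c in v))
        best = max(range(n), key=scores.__getitem__)
        gamma = 2.0 / (t + 3.0)  # diminishing step; first step is 2/3, not 1
        p = [(1.0 - gamma) * w for w in p]
        p[best] += gamma
    return p


if __name__ == "__main__":
    # Hypothetical topology: 3 links, 5 candidate paths.
    paths = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
    design = frank_wolfe_a_optimal(paths)
    print("probe distribution:", [round(w, 3) for w in design])
    print("A-criterion:", a_objective(paths, design))
```

Each iteration needs only one matrix inverse and one pass over the candidate paths, and every update moves mass toward a single path. That per-step cost is what makes this family of methods attractive at scale. A production variant would likely replace the dense inverse with something cheaper.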

research
02/01/2016

Towards a Cognitive Routing Engine for Software Defined Networks

Most Software Defined Networks (SDN) traffic engineering applications us...
research
11/16/2019

Memory-Efficient Performance Monitoring on Programmable Switches with Lean Algorithms

Network performance problems are notoriously difficult to diagnose. Prio...
research
02/06/2019

KISS methodologies for network management and anomaly detection

Current networks are increasingly growing in size and complexity and so ...
research
12/02/2019

SSNdesign – an R package for pseudo-Bayesian optimal and adaptive sampling designs on stream networks

Streams and rivers are biodiverse and provide valuable ecosystem service...
research
02/07/2019

ML Health: Fitness Tracking for Production Models

Deployment of machine learning (ML) algorithms in production for extende...
research
06/30/2010

Dynamic and Transparent Analysis of Commodity Production Systems

We propose a framework that provides a programming interface to perform ...
research
10/11/2021

Towards a Cost vs. Quality Sweet Spot for Monitoring Networks

Continuously monitoring a wide variety of performance and fault metrics ...