TrafPy: Benchmarking Data Centre Network Systems

by Christopher W. F. Parsonson, et al.

Benchmarking is commonly used in research fields such as computer architecture design and machine learning as a powerful paradigm for rigorously assessing, comparing, and developing novel technologies. However, the data centre networking community lacks a standard open-access benchmark. This limits the community's understanding of existing systems and hinders the development, comparison, and testing of novel technologies. We present TrafPy, an open-access framework for generating both realistic and custom data centre network traffic traces. TrafPy is compatible with any simulation, emulation, or experimentation environment, and can be used for standardised benchmarking and for investigating the properties and limitations of network systems such as schedulers, switches, routers, and resource managers. To demonstrate the efficacy of TrafPy, we use it to conduct a thorough investigation into the sensitivity of four canonical scheduling algorithms (shortest remaining processing time, fair share, first fit, and random) to varying traffic trace characteristics. We show how the fundamental scheduler performance insights revealed by these tests translate to four realistic data centre network types: University, Private Enterprise, Commercial Cloud, and Social Media Cloud. We then draw conclusions as to which types of scheduling policies are most suited to which types of network load conditions and traffic characteristics, opening the possibility of application-informed decision making at the design stage and of new dynamically adaptable scheduling policies. TrafPy is open-sourced via GitHub, and all data associated with this manuscript is available via RDR.
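To make the idea of benchmarking a scheduler against a traffic trace concrete, the following is a minimal sketch in plain Python. It does not use the TrafPy API. The function names `generate_trace` and `mean_fct`, the exponential flow-size and inter-arrival distributions, and the single-link model are all assumptions made for illustration only. It compares two of the policies named above, shortest remaining processing time and random, by mean flow completion time.

```python
import math
import random


def generate_trace(num_flows, mean_size, mean_interarrival, seed=0):
    """Hypothetical trace generator: a list of (arrival_time, size) flows.

    Sizes and inter-arrival times are drawn from exponential distributions,
    which is an illustrative assumption, not a claim about real traffic.
    """
    rng = random.Random(seed)
    now = 0.0
    trace = []
    for _ in range(num_flows):
        now += rng.expovariate(1.0 / mean_interarrival)
        trace.append((now, rng.expovariate(1.0 / mean_size)))
    return trace


def mean_fct(trace, policy, capacity=1.0, seed=0):
    """Serve one flow at a time on a single link, preempting on arrivals.

    policy is "srpt" (smallest remaining size first) or "random".
    Returns the mean flow completion time (finish time minus arrival time).
    """
    rng = random.Random(seed)
    pending = []  # each entry is [remaining_size, arrival_time]
    fcts = []
    now, i = 0.0, 0
    while i < len(trace) or pending:
        if not pending:
            now = max(now, trace[i][0])  # link is idle: jump to next arrival
        while i < len(trace) and trace[i][0] <= now:
            pending.append([trace[i][1], trace[i][0]])
            i += 1
        if policy == "srpt":
            flow = min(pending, key=lambda f: f[0])
        else:
            flow = rng.choice(pending)
        next_arrival = trace[i][0] if i < len(trace) else math.inf
        finish = flow[0] / capacity
        if now + finish <= next_arrival:
            now += finish  # flow completes before the next arrival
            pending.remove(flow)
            fcts.append(now - flow[1])
        else:
            flow[0] -= (next_arrival - now) * capacity  # partial service
            now = next_arrival
    return sum(fcts) / len(fcts)


if __name__ == "__main__":
    trace = generate_trace(num_flows=500, mean_size=1.0, mean_interarrival=1.25)
    print("SRPT   mean FCT:", mean_fct(trace, "srpt"))
    print("Random mean FCT:", mean_fct(trace, "random"))
```

Changing `mean_size` or `mean_interarrival` alters the load on the link, which is one simple way to see how a scheduler's performance depends on trace characteristics.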



