Slowing Down for Performance and Energy: An OS-Centric Study in Network Driven Workloads

12/13/2021
by   Han Dong, et al.

This paper studies three fundamental aspects of an OS that affect the performance and energy efficiency of network processing: 1) batching, 2) processor energy settings, and 3) the logic and instructions of the OS networking paths. A network device's interrupt delay feature is used to induce batching, and processor frequency is manipulated to control the speed of instruction execution. A baremetal library OS is used to explore OS path specialization. The study shows that careful use of batching and interrupt delay yields 2X energy and performance improvements across different workloads. Surprisingly, we find that polling can be made energy efficient and can yield gains of up to 11X over baseline Linux. We developed a methodology and a set of tools for collecting system data to understand how energy is impacted at a fine granularity. This paper also identifies a number of other novel findings that have implications for OS design for networked applications, and it suggests a path forward that treats energy as a focal point of systems research.


