QWin: Enforcing Tail Latency SLO at Shared Storage Backend

06/17/2021
by Liuying Ma, et al.

Consolidating latency-critical (LC) and best-effort (BE) tenants at the storage backend helps increase resource utilization. Even if tenants use dedicated queues and threads for performance isolation, their threads still contend for CPU cores. We therefore argue that cores must be partitioned between LC and BE tenants, with each core dedicated to running a single thread. Besides frequently changing bursty load, fluctuating service time at the storage backend also drastically changes the number of cores a tenant needs. To guarantee tail latency service level objectives (SLOs), this abruptly changing demand for cores must be satisfied immediately; otherwise, tail latency SLOs are violated. Unfortunately, partitioning-based approaches lack the ability to react to the changing demand for cores, resulting in extreme latency spikes and SLO violations. In this paper, we present QWin, a tail-latency-SLO-aware core allocator that enforces tail latency SLOs at a shared storage backend. QWin consists of an SLO-to-core calculation model that accurately computes the number of cores from the definitive runtime load determined by a flexible request-based window, and an autonomous core allocation scheme that adjusts cores at an adaptive frequency by dynamically changing core policies. When consolidating multiple LC and BE tenants, QWin outperforms state-of-the-art approaches in guaranteeing tail latency SLOs for LC tenants while increasing the bandwidth of BE tenants by up to 31x.
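The core idea of an SLO-to-core calculation can be illustrated with a toy model: given the load observed in a request-based window and an estimate of per-request service time, choose just enough cores so the window drains within the latency budget. The following sketch is purely illustrative; the function name, parameters, and formula are assumptions for exposition, not QWin's actual model.

```python
import math

def cores_needed(window_requests, service_time, slo):
    """Illustrative SLO-to-core calculation (not the paper's model).

    Given `window_requests` pending requests in the current window,
    each with estimated CPU `service_time` (seconds), return the
    minimum number of cores so that the whole window drains within
    the tail latency budget `slo` (seconds).
    """
    if slo <= 0:
        raise ValueError("SLO must be positive")
    # Total CPU demand accumulated in this window.
    demand = window_requests * service_time
    # With c cores serving in parallel, the window drains in roughly
    # demand / c seconds; require demand / c <= slo, i.e. c >= demand / slo.
    return max(1, math.ceil(demand / slo))

# Example: 100 requests of 1 ms each under a 50 ms budget need 2 cores;
# a lighter window of 10 requests fits on a single core.
print(cores_needed(100, 0.001, 0.05))  # -> 2
print(cores_needed(10, 0.001, 0.05))   # -> 1
```

Because the window is defined over a fixed number of requests rather than a fixed time interval, the computed core count tracks load bursts directly: a burst shortens the window in wall-clock time and immediately raises the demand term.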

