Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores

02/02/2018
by   Diego Didona, et al.

This paper introduces the concept of size-aware sharding to improve tail latencies for in-memory key-value stores, and describes its implementation in the Minos key-value store. Tail latencies are crucial in distributed applications with high fan-out ratios, because overall response time is determined by the slowest response. Size-aware sharding distributes requests for keys to cores according to the size of the item associated with the key. In particular, requests for small and large items are sent to disjoint subsets of cores. Size-aware sharding improves tail latencies by avoiding head-of-line blocking, in which a request for a small item gets queued behind a request for a large item. Alternative size-unaware approaches to sharding, such as keyhash-based sharding, request dispatching, and stealing, do not avoid head-of-line blocking, and therefore exhibit worse tail latencies. The challenge in implementing size-aware sharding is to maintain high throughput by avoiding the cost of software dispatching and by achieving load balancing between different cores. Minos uses hardware dispatch for all requests for small items, which form the vast majority of all requests. It achieves load balancing by adapting the number of cores handling requests for small and large items to their relative presence in the workload. We compare Minos to three state-of-the-art designs of in-memory KV stores. Compared to its closest competitor, Minos achieves a 99th percentile latency that is up to two orders of magnitude lower. Put differently, for a target 99th percentile latency of 10 times the mean service time, Minos achieves a throughput that is up to 7.4 times higher.
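
The core idea of sending small and large items to disjoint sets of cores, and resizing those sets as the workload changes, can be illustrated with a minimal Python sketch. This is not the Minos implementation: the function names, the fixed size threshold, the use of a CRC32 key hash within each group, and the proportional rule for dividing cores are all assumptions made for illustration. In particular, the sketch does not model the hardware dispatch that Minos uses for small requests.

```python
from zlib import crc32


def split_cores(num_cores, small_work, large_work):
    """Divide cores between small and large items (hypothetical rule).

    Assumption: each class gets cores in proportion to its share of the
    measured work, and each class always keeps at least one core.
    Requires num_cores >= 2.
    """
    total = small_work + large_work
    if total == 0:
        n_small = num_cores - 1
    else:
        n_small = round(num_cores * small_work / total)
    n_small = max(1, min(num_cores - 1, n_small))
    return n_small, num_cores - n_small


def pick_core(key, item_size, threshold, n_small, n_large):
    """Map a request to a core index based on the size of its item.

    Cores [0, n_small) serve items of at most `threshold` bytes; cores
    [n_small, n_small + n_large) serve larger items. The two ranges are
    disjoint, so a small request never queues behind a large one.
    """
    h = crc32(key)  # key must be bytes
    if item_size <= threshold:
        return h % n_small
    return n_small + h % n_large


# Example: 8 cores, with small items accounting for 75% of the work.
n_small, n_large = split_cores(8, small_work=75, large_work=25)
small_core = pick_core(b"user:42", 100, 1024, n_small, n_large)
large_core = pick_core(b"blob:7", 500_000, 1024, n_small, n_large)
```

Calling `split_cores` again with fresh work estimates changes the boundary between the two groups, which is one simple way to approximate adapting the core counts to the relative presence of small and large items.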


