Memory at Your Service: Fast Memory Allocation for Latency-critical Services

09/07/2021
by Aidi Pi, et al.

Co-location and memory sharing between latency-critical services, such as key-value stores and web search, and best-effort batch jobs is an appealing approach to improving memory utilization in multi-tenant datacenter systems. However, we find that the very diverse goals of job co-location and the GNU/Linux system stack can lead to severe performance degradation of latency-critical services under memory pressure in a multi-tenant system. We address memory pressure for latency-critical services via fast memory allocation and proactive reclamation. We find that memory allocation latency dominates the overall query latency, especially under memory pressure. We analyze the default memory management mechanism provided by the GNU/Linux system stack and identify the reasons why it is inefficient for latency-critical services in a multi-tenant system. We present Hermes, a fast memory allocation mechanism in user space that adaptively reserves memory for latency-critical services. It advises the Linux OS to proactively reclaim memory of batch jobs. We implement Hermes in the GNU C Library. Experimental results show that Hermes reduces the average and the 99th percentile memory allocation latency by up to 54.4% [...] latency-critical services, Hermes reduces both the average and the 99th percentile tail query latency by up to 40.3% [...] jemalloc and TCMalloc, Hermes reduces Service Level Objective violations by up to 84.3%.
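
The idea of reserving memory in user space ahead of demand can be illustrated with a toy model. The sketch below is not Hermes and is not taken from the paper: the `ReservePool` class, its `headroom` knob, and the resizing rule are all hypothetical, chosen only to show why pre-faulted pages keep page faults off the allocation path. It does not model the proactive reclamation of batch-job memory.

```python
import mmap

PAGE = mmap.PAGESIZE


class ReservePool:
    """Toy user-space pool of pre-faulted pages (hypothetical, not Hermes)."""

    def __init__(self, initial_pages=4, headroom=2.0):
        self.min_pages = initial_pages  # never shrink below this reserve
        self.headroom = headroom        # hypothetical over-provisioning factor
        self.free = []                  # pages already backed by physical memory
        self.demand = 0                 # pages requested since the last adapt()
        self._resize(initial_pages)

    def _resize(self, target):
        # Grow: map anonymous pages and write to them now, so the page
        # fault is paid here rather than on a later request.
        while len(self.free) < target:
            page = mmap.mmap(-1, PAGE)
            page[0] = 0
            self.free.append(page)
        # Shrink: release pages the recent demand no longer justifies.
        while len(self.free) > target:
            self.free.pop().close()

    def alloc_page(self):
        self.demand += 1
        if self.free:
            return self.free.pop()      # fast path: page is already faulted in
        return mmap.mmap(-1, PAGE)      # slow path: faults on first touch

    def adapt(self):
        # Size the reserve from the demand seen in the last interval.
        target = max(self.min_pages, int(self.demand * self.headroom))
        self.demand = 0
        self._resize(target)
        return target
```

In this model, a service would call `alloc_page()` on its request path and run `adapt()` periodically from a background thread, so that mapping and faulting happen outside the latency-critical path.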

research
02/23/2022

Memory Planning for Deep Neural Networks

We study memory allocation patterns in DNNs during inference, in the con...
research
09/11/2023

Adaptive Address Family Selection for Latency-Sensitive Applications on Dual-stack Hosts

Latency is becoming a key factor of performance for Internet application...
research
01/14/2023

Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level

In-memory key-value stores (IMKVSes) serve many online applications beca...
research
10/16/2022

QStack: Re-architecting User-space Network Stack to Optimize CPU Efficiency and Service Quality

TCP/IP network stack is irreplaceable for Web services in datacenter fro...
research
08/19/2020

FIRM: An Intelligent Fine-Grained Resource Management Framework for SLO-Oriented Microservices

Modern user-facing latency-sensitive web services include numerous distr...
research
03/09/2018

ROLP: Runtime Object Lifetime Profiling for Big Data Memory Management

Low latency services such as credit-card fraud detection and website tar...
research
04/10/2018

A Non-blocking Buddy System for Scalable Memory Allocation on Multi-core Machines

Common implementations of core memory allocation components, like the Li...
