Hurry-up: Scaling Web Search on Big/Little Multi-core Architectures

12/20/2019
by Rajiv Nishtala, et al.

Heterogeneous multi-core systems such as big/little architectures have been introduced as an attractive server design option with the potential to improve performance under power constraints in data centres. Because big high-performance cores and little power-efficient cores run on the same system and share the workload, thread mapping and scheduling become much more challenging. The problem is particularly hard given the different trade-offs that heterogeneous cores impose on the quality of service (expressed as tail latency) experienced by user-facing applications such as Web Search. In this work, we present Hurry-up, a runtime thread mapping solution designed to select individual requests to run on the most appropriate heterogeneous cores to improve tail latency. Hurry-up accelerates compute-intensive requests on big cores, while letting less intensive threads execute on little cores. We implement and deploy Hurry-up on a real 64-bit big/little architecture (ARM Juno), and show that, compared to a conservative policy on Linux, Hurry-up reduces the server tail latency by 39.5
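
The mapping idea in the abstract (compute-intensive requests on big cores, less intensive ones on little cores) can be illustrated with a minimal sketch. This is not the paper's implementation: the core IDs, the `Request` type, the `choose_cores` and `map_requests` helpers, and the use of elapsed run time with a 50 ms threshold as a proxy for "compute-intensive" are all assumptions made here for illustration.

```python
from dataclasses import dataclass

# Hypothetical core IDs; the real numbering depends on the platform.
BIG_CORES = frozenset({0, 1})
LITTLE_CORES = frozenset({2, 3, 4, 5})


@dataclass
class Request:
    request_id: int
    elapsed_ms: float  # how long the request has been running so far


def choose_cores(request, threshold_ms=50.0):
    """Pick a core set for one request.

    Assumption: a request that has already run for at least
    `threshold_ms` is treated as compute-intensive and sent to the
    big cores; everything else stays on the little cores.
    """
    if request.elapsed_ms >= threshold_ms:
        return BIG_CORES
    return LITTLE_CORES


def map_requests(requests, threshold_ms=50.0):
    """Return a {request_id: core set} mapping for a batch of requests."""
    return {r.request_id: choose_cores(r, threshold_ms) for r in requests}


if __name__ == "__main__":
    batch = [Request(1, 10.0), Request(2, 120.0), Request(3, 50.0)]
    for rid, cores in map_requests(batch).items():
        kind = "big" if cores == BIG_CORES else "little"
        print(f"request {rid} -> {kind} cores {sorted(cores)}")
```

On Linux, one way to apply such a decision to a worker thread would be `os.sched_setaffinity`, which pins a thread or process to a given set of CPU IDs; whether Hurry-up uses that mechanism is not stated in the abstract.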


08/07/2021

Asymmetry-aware Scalable Locking

The pursuit of power-efficiency is popularizing asymmetric multicore pro...
08/24/2020

Evaluation of hybrid run-time power models for the ARM big.LITTLE architecture

Heterogeneous processors, formed by binary compatible CPU cores with dif...
09/10/2021

Analytical Process Scheduling Optimization for Heterogeneous Multi-core Systems

In this paper, we propose the first optimum process scheduling algorithm...
06/17/2021

QWin: Enforcing Tail Latency SLO at Shared Storage Backend

Consolidating latency-critical (LC) and best-effort (BE) tenants at stor...
03/14/2019

High-Throughput CNN Inference on Embedded ARM big.LITTLE Multi-Core Processors

IoT Edge intelligence requires Convolutional Neural Network (CNN) infere...
02/02/2018

Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores

This paper introduces the concept of size-aware sharding to improve tail...
11/30/2020

HeM3D: Heterogeneous Manycore Architecture Based on Monolithic 3D Vertical Integration

Heterogeneous manycore architectures are the key to efficiently execute ...