Micro-architectural Analysis of OLAP: Limitations and Opportunities

08/13/2019
by Utku Sirin, et al.

Understanding micro-architectural behavior is essential for using hardware resources efficiently. Recent work has shown that, despite being aggressively optimized for modern hardware, in-memory online transaction processing (OLTP) systems severely underutilize their core micro-architecture resources [25]. Online analytical processing (OLAP) workloads, on the other hand, exhibit a completely different computing pattern. OLAP workloads are read-only and bandwidth-intensive, and they mix sequential and random data accesses. In addition, with the rise of column-stores, they run on high-performance engines that are tightly optimized for the efficient use of modern hardware. Hence, the micro-architectural behavior of modern OLAP systems remains unclear. This work presents a micro-architectural analysis of a breadth of OLAP systems. We examine CPU cycles and memory bandwidth utilization. The results show that, unlike traditional commercial OLTP systems, traditional commercial OLAP systems do not suffer from instruction cache misses. Nevertheless, they suffer from a large instruction footprint, which results in slow response times. High-performance OLAP engines execute tight instruction streams; however, they spend 25 to 82% of their CPU cycles on stalls, regardless of whether the workload is sequential- or random-access-heavy. In addition, high-performance OLAP engines underutilize multi-core CPU or memory bandwidth resources because their compute and memory demands are disproportionate. Hence, analytical processing engines should carefully assign their compute and memory resources for efficient multi-core micro-architectural utilization.
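The two quantities the abstract examines, CPU cycles and memory bandwidth utilization, can be illustrated with a minimal sketch. It assumes raw hardware-counter readings are already available. The `CounterSample` type, its field names, and every number below are hypothetical placeholders for illustration; they are not measurements or tooling from the paper.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical counter readings for one query run (illustrative only)."""
    total_cycles: int        # all CPU cycles spent on the query
    stall_cycles: int        # cycles in which the core made no forward progress
    bytes_transferred: int   # bytes moved between memory and the CPU
    elapsed_seconds: float   # wall-clock duration of the run


def stall_fraction(sample: CounterSample) -> float:
    """Share of CPU cycles spent on stalls, as a value between 0 and 1."""
    if sample.total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return sample.stall_cycles / sample.total_cycles


def bandwidth_utilization(sample: CounterSample, peak_bytes_per_second: float) -> float:
    """Achieved memory bandwidth divided by the machine's peak bandwidth."""
    if sample.elapsed_seconds <= 0 or peak_bytes_per_second <= 0:
        raise ValueError("elapsed_seconds and peak_bytes_per_second must be positive")
    achieved = sample.bytes_transferred / sample.elapsed_seconds
    return achieved / peak_bytes_per_second


# Made-up numbers, chosen only to make the arithmetic easy to follow.
sample = CounterSample(
    total_cycles=1_000_000,
    stall_cycles=250_000,
    bytes_transferred=5_000_000_000,
    elapsed_seconds=1.0,
)
print(f"stall fraction: {stall_fraction(sample):.2f}")
print(f"bandwidth utilization: {bandwidth_utilization(sample, 10e9):.2f}")
```

A high stall fraction combined with low bandwidth utilization would point to the kind of mismatch between compute and memory demands that the abstract describes, though real measurements need a profiler and a known peak bandwidth for the specific machine.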


