Efficient Characterization of Hidden Processor Memory Hierarchies

06/12/2018
by Keith Cooper, et al.

A processor's memory hierarchy has a major impact on the performance of running code. However, computation increasingly runs on platforms, such as the clouds and clusters used for large-scale computation, where the actual hardware characteristics are hidden from both the end user and the tools that mediate execution, such as the compiler, the JIT, and the runtime system. Even worse, in such environments a single computation may use a collection of processors with dissimilar characteristics. Ignorance of the performance-critical parameters of the underlying system makes it difficult to improve performance by optimizing the code or adjusting runtime-system behavior; it also makes application performance harder to understand. To address this problem, we have developed a suite of portable tools that can efficiently derive many of the parameters of processor memory hierarchies, such as the number of levels and the effective capacity and latency of caches and TLBs, in a matter of seconds. The tools use a series of carefully considered experiments to produce and analyze cache response curves automatically. They are inexpensive enough to be used in a variety of contexts, including install-time, compile-time, or runtime adaptation, as well as performance-understanding tools.
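To make the idea of a cache response curve concrete, here is a minimal sketch and not the authors' tool. It assumes a pointer-chasing experiment over a randomly permuted array, a common way to defeat hardware prefetching, and a simple ratio threshold for spotting jumps in the curve. The names `build_chase`, `time_chase`, and `find_capacity_steps`, and the `ratio` parameter, are hypothetical and chosen only for illustration.

```python
import random
import time


def build_chase(n, seed=0):
    """Return a random permutation of range(n) that forms one cycle.

    Uses Sattolo's algorithm, so following perm[i] repeatedly visits
    every slot before returning to the start.
    """
    rng = random.Random(seed)
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i guarantees a single cycle
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def time_chase(n, steps=200_000):
    """Average time per dependent access, in nanoseconds, for a
    working set of n slots. In Python, interpreter overhead dominates,
    so the absolute numbers are illustrative only."""
    nxt = build_chase(n)
    idx = 0
    start = time.perf_counter()
    for _ in range(steps):
        idx = nxt[idx]  # each load depends on the previous one
    elapsed = time.perf_counter() - start
    return elapsed / steps * 1e9


def find_capacity_steps(sizes, latencies, ratio=1.5):
    """Return the sizes just before each latency jump.

    Assumption: a level boundary appears as a latency increase of more
    than `ratio` between consecutive sample points.
    """
    levels = []
    for k in range(1, len(sizes)):
        if latencies[k] > ratio * latencies[k - 1]:
            levels.append(sizes[k - 1])
    return levels
```

Calling `time_chase` over a range of working-set sizes yields the points of a response curve, and `find_capacity_steps` then reports the largest sampled size before each jump as an estimate of effective capacity. A practical tool would be written in a compiled language and would need far more careful timing and noise handling than this sketch.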
