PPT-SASMM: Scalable Analytical Shared Memory Model: Predicting the Performance of Multicore Caches from a Single-Threaded Execution Trace

03/19/2021
by Atanu Barai, et al.

Performance modeling of parallel applications on multicore processors remains a challenge in computational co-design because multicore designs are complex and include intricate private and shared memory hierarchies. We present a Scalable Analytical Shared Memory Model (SASMM) that predicts the performance of parallel applications running on a multicore. SASMM uses a probabilistic, computationally efficient method to predict the reuse distance profiles of caches in multicores. It relies on a stochastic, static basic block-level analysis of reuse profiles, which are calculated from the memory traces of applications run sequentially rather than from multi-threaded traces. The experiments show that our model can predict private L1 cache hit rates with a 2.12% error rate.
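To make the notion of a reuse distance profile concrete, here is a minimal Python sketch. It is not the SASMM method: it assumes a single thread, a fully associative LRU cache, and a toy trace of block labels, and the function names (`reuse_distances`, `reuse_profile`, `lru_hit_rate`) are hypothetical. It only illustrates how a hit rate can be read off a reuse profile once one is available.

```python
from collections import Counter

def reuse_distances(trace):
    """Reuse distance of each access: the number of distinct blocks touched
    since the previous access to the same block (inf on first access)."""
    stack = []       # LRU stack, most recently used block at the end
    distances = []
    for block in trace:
        if block in stack:
            idx = stack.index(block)
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            distances.append(float("inf"))
        stack.append(block)
    return distances

def reuse_profile(distances):
    """Map each reuse distance to the fraction of accesses that have it."""
    counts = Counter(distances)
    total = len(distances)
    return {d: c / total for d, c in counts.items()}

def lru_hit_rate(profile, cache_blocks):
    """Assumption: fully associative LRU cache, so an access hits exactly
    when its reuse distance is smaller than the number of cache blocks."""
    return sum(p for d, p in profile.items() if d < cache_blocks)

# Toy trace of block labels (illustrative only, not data from the paper).
trace = ["a", "b", "c", "a", "b", "d", "a"]
profile = reuse_profile(reuse_distances(trace))
print(lru_hit_rate(profile, cache_blocks=3))
```

A real model would also have to account for set associativity and for interleaving of accesses from several cores; this sketch leaves both out.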


07/29/2019

Modeling Shared Cache Performance of OpenMP Programs using Reuse Distance

Performance modeling of parallel applications on multicore computers rem...
04/11/2021

PPT-Multicore: Performance Prediction of OpenMP applications using Reuse Profiles and Analytical Modeling

We present PPT-Multicore, an analytical model embedded in the Performanc...
09/10/2021

An Effective Early Multi-core System Shared Cache Design Method Based on Reuse-distance Analysis

In this paper, we proposed an effective and efficient multi-core shared-...
10/08/2020

Machine Learning Enabled Scalable Performance Prediction of Scientific Codes

We present the Analytical Memory Model with Pipelines (AMMP) of the Perf...
07/22/2020

Analytical Modeling the Multi-Core Shared Cache Behavior with Considerations of Data-Sharing and Coherence

To mitigate the ever worsening "Power wall" and "Memory wall" problems, ...
03/05/2018

On the accuracy and usefulness of analytic energy models for contemporary multicore processors

This paper presents refinements to the execution-cache-memory performanc...