uiCA: Accurate Throughput Prediction of Basic Blocks on Recent Intel Microarchitectures

07/29/2021
by Andreas Abel, et al.

Performance models that statically predict the steady-state throughput of basic blocks on particular microarchitectures, such as IACA, Ithemal, llvm-mca, OSACA, or CQA, can guide optimizing compilers and aid manual software optimization. However, their utility heavily depends on the accuracy of their predictions. The average error of existing models compared to measurements on the actual hardware has been shown to lie between 9% and 36%. But how good is this? To answer this question, we propose an extremely simple analytical throughput model that may serve as a baseline. Surprisingly, this model is already competitive with the state of the art, indicating that there is significant potential for improvement. To explore this potential, we develop a simulation-based throughput predictor. To this end, we propose a detailed parametric pipeline model that supports all Intel Core microarchitecture generations released between 2011 and 2021. We evaluate our predictor on an improved version of the BHive benchmark suite and show that its predictions are usually within 1% of measurement results, improving upon prior models by roughly an order of magnitude. The experimental evaluation also demonstrates that several microarchitectural details considered to be rather insignificant in previous work are in fact essential for accurate prediction. Our throughput predictor is available as open source at https://github.com/andreas-abel/uiCA.
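
The abstract does not spell out the baseline model. As a rough illustration of what a simple analytical throughput bound can look like, the sketch below takes the maximum over a few per-cycle resource limits. The function name, its parameters, and the default widths (4 instructions issued, 2 loads, and 1 store per cycle) are assumptions chosen for illustration; they are not taken from the text above and may differ from the model the authors propose.

```python
def baseline_throughput(
    num_instrs: int,
    num_mem_reads: int,
    num_mem_writes: int,
    issue_width: int = 4,       # assumed: instructions issued per cycle
    loads_per_cycle: int = 2,   # assumed: memory reads served per cycle
    stores_per_cycle: int = 1,  # assumed: memory writes served per cycle
) -> float:
    """Hypothetical lower bound on cycles per iteration of a basic block.

    Each resource alone needs at least (demand / capacity) cycles per
    iteration, so the steady-state throughput cannot be better than the
    largest of these individual bounds.
    """
    return max(
        num_instrs / issue_width,
        num_mem_reads / loads_per_cycle,
        num_mem_writes / stores_per_cycle,
    )


def relative_error(predicted: float, measured: float) -> float:
    """Relative prediction error with respect to a hardware measurement."""
    return abs(predicted - measured) / measured
```

For a block with 8 instructions, 2 loads, and 1 store, the issue width is the limiting resource under these assumed widths, giving a bound of 2 cycles per iteration. A simulation-based predictor replaces such closed-form bounds with a cycle-by-cycle model of the pipeline.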
