Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT

by Julian Hornich, et al.

Stencil algorithms have received considerable interest in HPC research for decades. The techniques used for multi-core stencil performance modeling and engineering span basic runtime measurements, elaborate performance models, detailed hardware counter analysis, and thorough evaluation of scaling behavior. Given the plurality of approaches and stencil patterns, we set out to develop a generalizable methodology for reproducible measurements accompanied by state-of-the-art performance models. Our open-source toolchain and collected results are publicly available in the "Intranode Stencil Performance Evaluation Collection" (INSPECT). We present the underlying methodologies, models, and tools involved in gathering and documenting the performance behavior of a collection of typical stencil patterns across multiple architectures and hardware configuration options. Our aim is to give performance-aware application developers reproducible baseline performance data and validated models with which to start a well-defined process of performance assessment and optimization.






