Multi-level analysis of compiler-induced variability and performance tradeoffs

11/14/2018
by Michael Bentley, et al.

Floating-point arithmetic is the computational foundation of numerical scientific software. Compiler optimizations that affect floating-point arithmetic can have a significant impact on the integrity, reproducibility, and performance of HPC scientific applications. Unfortunately, the interplay between compiler-induced variability and the runtime performance gained from compiler optimizations is not well understood by programmers, a problem that is aggravated by the lack of analysis tools in this domain. In this paper, we present a novel set of techniques, as part of a multi-level analysis, that allow programmers to automatically search the space of compiler-induced variability and performance using their own inputs and metrics. These techniques allow programmers to pinpoint the root cause of non-reproducible behavior down to function granularity across multiple compilers and platforms using a bisection algorithm. We have demonstrated our methods on real-world code bases. We provide a performance and reproducibility analysis of the MFEM library, as well as a study of compiler characterization in which we attempt to isolate all 1,086 instances of result variability that were found. We also analyze the Laghos proxy app and identify a significant divergent floating-point variability in its code base. Our bisect algorithm pinpointed the problematic function with as few as 14 program executions. Furthermore, an evaluation with 4,376 controlled injections of floating-point perturbations on the LULESH proxy application found that our framework is 100% accurate in detecting the file and function location of the injected problem, with an average of 15 program executions.
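To make the bisection idea concrete, the following is a minimal sketch and not the paper's actual implementation. It assumes a hypothetical user-supplied test `differs(subset)` that rebuilds the program with only the functions in `subset` compiled under the suspect optimization, runs it, and reports whether the result departs from the trusted baseline. It further assumes that each problematic function causes variability on its own, which may not hold for real code. The names `bisect_culprits`, `differs`, and `make_simulated_test` are illustrative only.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple


def bisect_culprits(
    items: Sequence[str],
    differs: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the items that make differs() report variability.

    Assumption: culprits act independently, so a subset shows variability
    exactly when it contains at least one culprit.
    """
    # One program execution: if this subset is clean, discard it entirely.
    if not items or not differs(items):
        return []
    # A single item that still shows variability is a culprit.
    if len(items) == 1:
        return [items[0]]
    # Otherwise split in half and search both halves.
    mid = len(items) // 2
    return bisect_culprits(items[:mid], differs) + bisect_culprits(items[mid:], differs)


def make_simulated_test(
    bad: Set[str],
) -> Tuple[Callable[[Sequence[str]], bool], Dict[str, int]]:
    """Build a stand-in for a real compile-and-run test (hypothetical).

    The returned counter records how many "program executions" were used.
    """
    calls = {"n": 0}

    def differs(subset: Sequence[str]) -> bool:
        calls["n"] += 1
        return any(name in bad for name in subset)

    return differs, calls


if __name__ == "__main__":
    functions = [f"func_{i}" for i in range(64)]
    differs, calls = make_simulated_test({"func_37"})
    found = bisect_culprits(functions, differs)
    print("culprits:", found, "executions:", calls["n"])
```

With a single culprit among n candidates, this sketch needs roughly 2·log2(n) executions rather than n, which is consistent in spirit with the small execution counts reported above. The same routine can be applied first to source files and then to the functions inside each flagged file.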


