Finding Root Causes of Floating Point Error with Herbgrind
Floating-point arithmetic plays a central role in science, engineering, and finance by enabling developers to compute approximately with real numbers. To address numerical issues in large floating-point applications, developers must identify root causes, which is difficult because floating-point errors are generally silent, non-local, and non-compositional. This paper presents Herbgrind, a tool to help developers identify and address root causes in typical numerical code written in low-level C/C++ and Fortran. Herbgrind tracks dependencies between operations and program outputs to avoid false positives, and abstracts erroneous computations to a simplified program fragment whose improvement can reduce output error. We perform several case studies applying Herbgrind to large, expert-crafted numerical programs and show that it scales to applications spanning hundreds of thousands of lines, correctly handling the low-level details of modern floating-point hardware and mathematical libraries, and tracking error across function boundaries and through the heap.
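To make the notion of a silent error and of a "simplified program fragment whose improvement can reduce output error" concrete, the following is a small illustrative sketch. It is not Herbgrind's implementation or output; the expression, the function names (`naive`, `rewritten`, `reference`, `relative_error`), and the use of Python's `decimal` module as a high-precision reference are all assumptions chosen for illustration. The fragment `sqrt(x + 1) - sqrt(x)` runs without any warning yet loses most of its accuracy for large `x`, while an algebraically equivalent rewrite does not.

```python
import math
from decimal import Decimal, getcontext

# 50 significant digits: far more than a double's ~16, so the
# Decimal result can serve as a stand-in for the "true" real value.
getcontext().prec = 50


def naive(x):
    # Subtracts two nearly equal numbers: catastrophic cancellation.
    return math.sqrt(x + 1.0) - math.sqrt(x)


def rewritten(x):
    # Algebraically equivalent form that avoids the subtraction.
    return 1.0 / (math.sqrt(x + 1.0) + math.sqrt(x))


def reference(x):
    # High-precision evaluation of the same expression (illustrative only).
    d = Decimal(x)
    return (d + 1).sqrt() - d.sqrt()


def relative_error(approx, exact):
    # Decimal(float) converts the double exactly, so no extra error is added.
    return abs((Decimal(approx) - exact) / exact)


x = 1e15
exact = reference(x)
err_naive = relative_error(naive(x), exact)
err_rewritten = relative_error(rewritten(x), exact)

print("naive relative error:    ", err_naive)
print("rewritten relative error:", err_rewritten)
```

Neither version raises an exception or emits a diagnostic, which is what makes such errors silent: the inaccuracy of the naive form only becomes visible when its result is compared against a more precise reference.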