1 Introduction
Floatingpoint numbers are widely used to represent real numbers in computer programs since they offer a good tradeoff between efficiency and precision. The roundoff error of a floatingpoint expression is the difference between the ideal computation in real arithmetic and the actual floatingpoint computation. These roundoff errors accumulate during numerical computations. Besides having a direct effect on the result of mathematical operations, roundoff errors may significantly impact the control flow of a program. This happens when the guard of a conditional statement contains a floatingpoint expression whose roundoff error makes the actual Boolean value of the guard differ from the value that would be obtained assuming real arithmetic. In this case, the conditional statement is called an unstable test. Unstable tests are an inherent feature of floatingpoint programs. In general, it is not possible to completely avoid them. However, it is possible to mitigate their effect by transforming the original program into another program that conservatively (and soundly) detects and corrects unstable tests.
This paper presents a program transformation technique to transform a given program into a new one that returns either the same result of the original program or a warning when the real and floatingpoint flows may diverge. This transformation is parametric with respect to two Boolean abstractions that take into consideration the roundoff error in the expressions occurring in the guard. The transformation replaces the unstable conditions with more restrictive conditions that are guaranteed to preserve the control flow of stable tests. The correctness of the proposed transformation is formally verified in the Prototype Verification System (PVS) [OwreRS92].
The remainder of the paper is organized as follows. sec:fp_err provides technical background on floatingpoint numbers and roundoff errors. The proposed program transformation technique is presented in sec:transformation. sec:polycarp illustrates this technique by transforming the core logic of an algorithm for polygon containment that is part of a geofencing system developed by NASA. sec:related discusses related work and sec:concl concludes the paper.
2 RoundOff Errors and Unstable Tests
A floatingpoint number can be formalized as a pair of integers , where is called the significand and the exponent of the float [BoldoMunoz06, Daumas2001]. A floatingpoint format is defined as a pair of integers , where is called the precision and is called the minimal exponent. For instance, IEEE single and double precision floatingpoint numbers are specified by the formats and , respectively. A canonical float is a float such that is either a normal or subnormal. A normal float is a float such that the significand cannot be multiplied by the radix and still fit in the format. A subnormal float is a float having the minimal exponent such that its significand can be multiplied by the radix and still fit in the format. Henceforth, will denote the set of floatingpoint numbers in canonical form and the expression will denote a floatingpoint number in . A conversion function is defined to refer to the real number represented by a given float, i.e., .
The expression denotes the floatingpoint number in format closest to . The format will be omitted when clear from the context. Let be a floatingpoint number that represents a real number , the difference is called the roundoff error (or rounding error) of with respect to .
2.1 Unstable tests
Given a set of predefined floatingpoint operations, the corresponding set of operations over real numbers, a finite set of variables representing real values, and a finite set of variables representing floatingpoint values, where and are disjoint, the sets and of arithmetic expressions over real numbers and over floatingpoint numbers, respectively, are defined by the following grammar.
where , , , , , , , . It is assumed that there is a function that associates to each floatingpoint variable a variable representing the real value of . The function converts an arithmetic expression on floatingpoint numbers to an arithmetic expression on real numbers. It is defined by simply replacing each floatingpoint operation with the corresponding one on real numbers and by applying and to floatingpoint values and variables, respectively.
Boolean expressions are defined by the following grammar.
where and . The conjunction , disjunction , negation , , and have the usual classical logic meaning. The symbols and denote the domain of Boolean expressions over real and floatingpoint numbers, respectively. The function converts a Boolean expression on floatingpoint numbers to a Boolean expression on real numbers. Given a variable assignment , denotes the evaluation of the real Boolean expression . Similarly, given and , denotes the evaluation of the floatingpoint Boolean expression .
The expression language considered in this paper contains binary and ary conditionals, let expressions, arithmetic expressions, and a warning exceptional statement. Given a set of function symbols, the syntax of program expressions in is given by the following grammar.
(1)  
where , , , and . The notation denotes a list of branches.
A program is a function declaration of the form , where are pairwise distinct variables in and all free variables appearing in are in . The natural number is called the arity of . The set of programs is denoted as .
When ifthenelse guards contain floatingpoint expressions, the output of the considered program is not only directly influenced by rounding errors, but also by the error of taking the incorrect branch in the case of unstable tests.
Definition 1 (Conditional Instability)
A function declaration is said to have an unstable conditional when its body contains a conditional statement of the form and there exist two assignments and such that for all , and . Otherwise, the conditional expression is said to be stable.
In other words, a conditional statement (or test) is unstable when there exists an assignments from the free variables in to such that evaluates to a different Boolean value with respect to its real valued counterpart . In these cases, the program is said to follow an unstable path, otherwise, when the flows coincide, it is said to follow a stable path.
2.2 FloatingPoint Denotational Semantics
This section presents a compositional denotational semantics for the expression language of Formula (1) that models both real and floatingpoint path conditions and outputs. This semantics is a modification of the one introduced in [MoscatoTDM17] and [TitoloFMM18]
. The proposed semantics collects for each combination of real and floatingpoint program paths: the real and floatingpoint path conditions, two symbolic expressions representing the value of the output assuming the use of real and floatingpoint arithmetic, respectively, and a flag indicating if the element refers to either a stable or an unstable path. This information is stored in a
conditional tuple.Definition 2 (Conditional Tuple)
A conditional tuple is an expression of the form , where , , , , and .
Intuitively, indicates that if the condition is satisfied, the output of the ideal realvalued implementation of the program is and the output of the floatingpoint execution is . The subindex is used to mark by construction whether a conditional tuple corresponds to an unstable path, when , or to a stable path, when . The element represents the output of the warning construct. Let be the set of all conditional error bounds, and be the domain formed by sets of conditional error bounds.
An environment is defined as a function mapping a variable to a set of conditional tuples, i.e., . The empty environment is denoted as and maps every variable to the empty set .
Given , the semantics of program expressions is defined in fig:sem as a function that returns the set of conditional tuples representing the possible real and floatingpoint computations and their corresponding path conditions. The operator denotes the least upper bound of the domain of conditional error bounds.
The semantics of a variable consists of two cases. If belongs to the environment, then the variable has been previously bound to a program expression through a letexpression. In this case, the semantics of is exactly the semantics of . If does not belong to the environment, then is a parameter of the function. Here, a new conditional error bound is added with a placeholder representing the real value of . The semantics of a floatingpoint arithmetic operation is computed by composing the semantics of its operands. The real and floatingpoint values are obtained by applying the corresponding arithmetic operation to the values of the operands. The new conditions are obtained as the combination of the conditions of the operands. The semantics of the expression updates the current environment by associating with variable the semantics of expression .
The semantics of the conditional uses an auxiliary operator .
Definition 3 (Condition propagation operator)
Given and , if , otherwise it is undefined. The definition of naturally extends to sets of conditional tuples: given , .
The semantics of and are enriched with the information about the fact that real and floatingpoint control flows match, i.e., both and have the same value. In addition, new conditional tuples are built to model the unstable cases when real and floatingpoint control flows do not coincide and, therefore, real and floatingpoint computations diverge. For example, if is satisfied but is not, the branch is taken in the floatingpoint computation, but the would have been taken in the real one. In this case, the real condition and its corresponding output are taken from the semantics of , while the floatingpoint condition and its corresponding output are taken from the semantics of . The condition is propagated in order to model that holds but does not. The conditional tuples representing this case are marked with .
Similarly, the semantics of an nary conditional is composed of stable and unstable cases. The stable cases are built from the semantics of all the program subexpressions by enriching them with the information stating that the correspondent guard and its real counterpart hold and all the previous guards and their real counterparts do not hold. All the unstable combinations are built by combining the real parts of the semantics of a program expression and the floatingpoint contributions of a different program expression . In addition, the operator is used to propagate the information that the real guard of and the floatingpoint guard of hold, while the guards of the previous branches do not hold.
3 Program Transformation
In this section, a program transformation is proposed for detecting when roundoff errors affect the evaluation of floatingpoint conditionals and for ensuring that when the floatingpoint control flow diverges from the real one a warning is issued. The proposed transformation takes into account roundoff errors by abstracting the Boolean expressions in the guards of the original program. This is done by means of two Boolean abstractions .
Given , let be the set of free variables in . For all , , and such that , and satisfy the following properties.

.

.
Property 1 states that for all floatingpoint Boolean expressions , implies both and its real counterpart. Symmetrically, Property 2 ensures that implies both the negation of and the negation of its real counterpart.
Example 1
The Boolean abstractions and can be instantiated as follows for conjunctions and disjunction of sign tests. Properties 1 and 2 are formally proven in PVS to hold for the following definitions of and . Let and such that .
The abstractions performed for sign tests are not correct for generic inequalities of the form . In this case, to compensate for the roundoff errors of both expressions, additional floatingpoint operations must be performed. Thus, the roundoff error generated by such operations needs to be considered as well to obtain a sound approximation. The naive application of this strategy leads to a nonterminating transformation. The design of an effective approximation for these generic inequalities is left as future work.
The program transformation is defined as follows.
Definition 4 (Program Transformation)
Let be a floatingpoint program that does not contain any statements, the transformed program is defined as where is defined as follows.
In the case of the binary conditional statement, the then branch of the transformed program is taken when is satisfied. By Property 1, this means that in the original program both and hold and, thus, the then branch is taken in both real and floatingpoint control flows. Similarly, the else branch of the transformed program is taken when holds. This means, by Property 2, that in the original program the else branch is taken in both real and floatingpoint control flows. In the case real and floatingflows diverge, neither nor is satisfied and a warning is returned.
In the case of the nary conditional statements, the guard of the th branch is replaced by the conjunction of and for all the previous branches . By properties 1 and 2, it follows that the transformed program takes the th branch only when the same branch is taken in both real and floatingpoint control flows of the original program. Additionally, a warning is issued by the transformed program when real and floatingpoint control flows of the original program differ.
The following theorem states the correctness of the program transformation . If the transformed program returns an output different from , then the original program follows a stable path and returns the floatingpoint output . Furthermore, in the case the original program presents an unstable behavior, the transformed program returns .
Theorem 3.1 (Program Transformation Correctness)
Given , , and , such that for all , :

for all such that , there exists such that and ;

for all , there exists such that .
The program transformation defined in def:prog_trans has been formalized and th:corr_trans has been proven correct in PVS.^{1}^{1}1This formalization is available at https://shemesh.larc.nasa.gov/fm/PRECiSA.
It is important to remark that the intended semantics of the floatingpoint transformed program is the realvalued semantics of the original one, i.e., the realvalued semantics of the transformed program is irrelevant. Therefore, even if the transformed program presents unstable tests, th:corr_trans ensures that its floatingpoint control flow preserves the control flow of stable tests in the original program.
Example 2
Consider the program eps_line, which is part of the ACCoRD conflict detection and resolution algorithm [DowekMC05]. This function is used to compute an implicitly coordinated horizontal resolution direction for the aircraft involved in a pairwise conflict.
where and are floatingpoint variables. For example, if the values of such variables are assumed to lie in the range , the tool PRECiSA [MoscatoTDM17, TitoloFMM18]
can be used to compute the roundoff error estimation
for . PRECiSA is a tool that overapproximates the roundoff error of floatingpoint programs. It is fully automatic and generates PVS proof certificates that guarantee the correctness of the error estimations with respect to the floatingpoint IEEE754 standard. The following program is obtained by using the transformation with the Boolean approximations of ex:beta.The condition never holds since is a positive number. Therefore, the transformed program never returns 0. Indeed, when is close to , the test is unstable. The transformed program detects these unstable cases and returns a warning.
4 Case Study: PolyCARP algorithm
PolyCARP^{2}^{2}2PolyCARP is available at https://github.com/nasa/polycarp. (Algorithms for Computations with Polygons) [NarkawiczH16, NarkawiczMD17] is a suite of algorithms for geocontainment applications. One of the main applications of PolyCARP is to provide geofencing capabilities to unmanned aerial systems (UAS), i.e., detecting whether a UAS is inside or outside a given geographical region, which is modeled using a 2D polygon with a minimum and a maximum altitude. Another application of PolyCARP is the detection of weather cells, modeled as moving polygons, along an aircraft trajectory.
A core piece of logic in PolyCARP is the polygon containment algorithm, i.e., the algorithm that checks whether or not a point lies in the interior of a polygon. Algorithms for polygon containment have to be carefully implemented since numerical errors may lead to wrong answers, even in cases where the point is far from the boundaries of the polygon. PolyCARP uses several techniques to detect if a point is contained in a polygon. One of these techniques relies on the computation of the winding number. This number corresponds to the number of times the polygon winds around .
Consider two consecutive vertices and of the polygon in the Cartesian plane with the point as the origin. The function winding_number_edge checks in which quadrants and are located and counts how many axes are crossed by the edge . If and belong to the same quadrant, the contribution of the edge to the winding number is 0 since no axis is crossed. If and lie in adjacent quadrants, the contribution is 1 (respectively 1) if moving from to along the edge is in counterclockwise (respectively clockwise) direction. In the case and are in opposite quadrants, the determinant is computed for checking the direction of the edge. If it is counterclockwise the contribution is 2, otherwise it is 2. The winding number is obtained as the sum of the contributions of all the edges of the polygon. If the result is 0 or 4, the point is inside the polygon, otherwise, it is outside.
where
The function winding_number_edge has been verified in PVS using real arithmetic. However, due to floatingpoint errors, taking the incorrect branch for one of the edges in the computation of the winding number may result in an incorrect conclusion about the position of the point with respect to the polygon. In order to overcome this problem, the transformation of def:prog_trans is applied to the function resulting in the following function. Given initial bounds for the input variables, PRECiSA [MoscatoTDM17, TitoloFMM18] can be used to compute the roundoff error estimations for , , , and the determinant, which are denoted , , , , an
Comments
There are no comments yet.