1 Introduction
Information about the memory locations accessed by a program is crucial for many applications such as static data race detection [45], code optimisation [26, 33, 16], program parallelisation [17, 5], and program verification [30, 23, 39, 38]. The problem of inferring this information statically has been addressed by a variety of static analyses, e.g., [9, 42]. However, prior works provide only partial solutions for the important class of array-manipulating programs for at least one of the following reasons: (1) they approximate the entire array as one single memory location [4], which leads to imprecise results; (2) they do not produce specifications, which are useful for several important applications such as human inspection, test case generation, and especially deductive program verification; (3) they are limited to sequential programs.
In this paper, we present a novel analysis for array programs that addresses these shortcomings. Our analysis employs the notion of access permission from separation logic and similar program logics [40, 43]. These logics associate a permission with each memory location and enforce that a program part accesses a location only if it holds the associated permission. In this setting, determining the accessed locations amounts to inferring a sufficient precondition that specifies the permissions required by a program part.
Phrasing the problem as one of permission inference allows us to address the three problems mentioned above. (1) We distinguish different array elements by tracking the permission for each element separately. (2) Our analysis infers pre- and postconditions for both methods and loops and emits them in a form that can be used by verification tools. The inferred specifications can easily be complemented with permission specifications for non-array data structures and with functional specifications. (3) We support concurrency in three important ways. First, our analysis is sound for concurrent program executions because permissions guarantee that program executions are data race free and reduce thread interactions to specific points in the program such as forking or joining a thread, or acquiring or releasing a lock. Second, we develop our analysis for a programming language with primitives that represent the ownership transfer that happens at these thread interaction points. These primitives, inhale and exhale [31, 38], express that a thread obtains permissions (for instance, by acquiring a lock) or loses permissions (for instance, by passing them to another thread along with a message) and can thereby represent a wide range of thread interactions in a uniform way [32, 44]. Third, our analysis distinguishes read and write access and, thus, ensures exclusive writes while permitting concurrent read accesses. As is standard, we employ fractional permissions [6] for this purpose; a full permission is required to write to a location, but any positive fraction permits read access.
1.0.1 Approach.
Our analysis reduces the problem of reasoning about permissions for array elements to reasoning about numerical values for permission fractions. To achieve this, we represent permission fractions for all array elements $q_a[q_i]$ using a single numerical expression parameterised by $q_a$ and $q_i$. For instance, the conditional term $(q_a = a \wedge q_i = j \;?\; 1 : 0)$ represents full permission (denoted by 1) for array element a[j] and no permission for all other array elements.
Our analysis employs a precise backwards analysis for loop-free code: a variation on the standard notion of weakest preconditions. We apply this analysis to loop bodies to obtain a permission precondition for a single loop iteration. Per array element, the whole loop requires the maximum fraction over all loop iterations, adjusted by permissions gained and lost during loop execution. Rather than computing permissions via a fixpoint iteration (for which a precise widening operator is difficult to design), we express them as a maximum over the variables changed by the loop execution. We then use inferred numerical invariants on these variables and a novel maximum elimination algorithm to infer a specification for the entire loop. Permission postconditions are obtained analogously.
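For the method copyEven in Fig. 2, the analysis determines that the permission amount required by a single loop iteration is $(j \,\%\, 2 = 0 \;?\; (q_a = a \wedge q_i = j \;?\; \mathrm{rd} : 0) : (q_a = a \wedge q_i = j \;?\; 1 : 0))$. The symbol rd represents a fractional read permission. Using a suitable integer invariant for the loop counter j, we obtain the loop precondition $\max_{j \mid 0 \le j < \mathrm{len}(a)} \big( (j \,\%\, 2 = 0 \;?\; (q_a = a \wedge q_i = j \;?\; \mathrm{rd} : 0) : (q_a = a \wedge q_i = j \;?\; 1 : 0)) \big)$. Our maximum elimination algorithm obtains $(q_a = a \wedge 0 \le q_i < \mathrm{len}(a) \;?\; (q_i \,\%\, 2 = 0 \;?\; \mathrm{rd} : 1) : 0)$. By ranging over all $q_a$ and $q_i$, this can be read as read permission for even indices and write permission for odd indices within the array a’s bounds.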
1.0.2 Contributions.
The contributions of our paper are:

An algorithm for eliminating maximum (and minimum) expressions over an unbounded number of cases (Sec. 5)

An implementation of our analysis, which will be made available as an artifact

An evaluation on benchmark examples from existing papers and competitions, demonstrating that we obtain sound, precise, and compact specifications, even for challenging array access patterns and parallel loops (Sec. 6)

Proof sketches for the soundness of our permission inference and the correctness of our maximum elimination algorithm (included in the appendix)
2 Programming Language
We define our inference technique over the programming language in Fig. 3. Programs operate on integers (expressions $e$), booleans (expressions $b$), and one-dimensional integer arrays (variables $a$); a generalisation to other forms of arrays is straightforward and supported by our implementation. Arrays are read and updated via the statements $x := a[e]$ and $a[e] := x$; array lookups in expressions are not part of the surface syntax, but are used internally by our analysis. Permission expressions $p$ evaluate to rational numbers; rd, min, and max are for internal use.
A full-fledged programming language contains many statements that affect the ownership of memory locations, expressed via permissions [32, 44]. For example, in a concurrent setting, a fork operation may transfer permissions to the new thread, acquiring a lock obtains permission to access certain memory locations, and messages may transfer permissions between sender and receiver. Even in a sequential setting, the concept is useful: in procedure-modular reasoning, a method call transfers permissions from the caller to the callee, and back when the callee terminates. Allocation can be represented as obtaining a fresh object and then obtaining permission to its locations.
For the purpose of our permission inference, we can reduce all of these operations to two basic statements that directly manipulate the permissions currently held [31, 38]. An inhale(a, e, p) statement adds the amount p of permission for the array location a[e] to the currently held permissions. Dually, an exhale(a, e, p) statement requires that this amount of permission is already held, and then removes it. We assume that for any inhale or exhale statements, the permission expression p denotes a non-negative fraction. For simplicity, we restrict inhale and exhale statements to a single array location, but the extension to unboundedly many locations from the same array is straightforward [37].
2.0.1 Semantics.
The operational semantics of our language is mostly standard, but is instrumented with additional state to track how much permission is held to each heap location. A program state therefore consists of a triple: a heap (mapping pairs of array identifier and integer index to integer values), a permission map (mapping such pairs to permission amounts), and an environment (mapping variables to values, i.e., integers or array identifiers).
The execution of inhale or exhale statements causes modifications to the permission map, and all array accesses are guarded with checks that at least some permission is held when reading and that full (1) permission is held when writing [6]. If these checks (or an exhale statement) fail, the execution terminates with a permission failure. Permission amounts greater than 1 indicate invalid states that cannot be reached by a program execution. We model run-time errors other than permission failures (in particular, out-of-bounds accesses) as stuck configurations.
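The instrumented semantics described above can be illustrated with a small executable model. The following Python sketch is not the paper's formal semantics: the class and method names (`State`, `PermissionFailure`, `held`) are hypothetical, unwritten heap cells are assumed to read as 0, and out-of-bounds accesses (stuck configurations) are not modelled.

```python
from fractions import Fraction


class PermissionFailure(Exception):
    """Raised when a statement needs more permission than is currently held."""


class State:
    """Heap plus permission map, both keyed by (array id, index) pairs."""

    def __init__(self):
        self.heap = {}  # (array id, index) -> integer value
        self.perm = {}  # (array id, index) -> Fraction currently held

    def held(self, a, i):
        return self.perm.get((a, i), Fraction(0))

    def inhale(self, a, i, p):
        # inhale adds the amount p to the permission held for a[i]
        self.perm[(a, i)] = self.held(a, i) + Fraction(p)

    def exhale(self, a, i, p):
        # exhale requires that p is held, then removes it
        p = Fraction(p)
        if self.held(a, i) < p:
            raise PermissionFailure(f"exhale of {p} for {a}[{i}]")
        self.perm[(a, i)] = self.held(a, i) - p

    def read(self, a, i):
        # reading needs some positive fraction
        if self.held(a, i) <= 0:
            raise PermissionFailure(f"read of {a}[{i}]")
        return self.heap.get((a, i), 0)  # assumption: default value 0

    def write(self, a, i, v):
        # writing needs the full permission 1
        if self.held(a, i) < 1:
            raise PermissionFailure(f"write to {a}[{i}]")
        self.heap[(a, i)] = v
```

With half a permission inhaled for a location, a read succeeds but a write raises `PermissionFailure`; inhaling the other half makes the write legal.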
3 Permission Inference for Loop-Free Code
Our analysis infers a sufficient permission precondition and a guaranteed permission postcondition for each method of a program. Both conditions are mappings from array elements to permission amounts. Executing a statement $s$ in a state whose permission map contains at least the permissions required by a sufficient permission precondition for $s$ is guaranteed not to result in a permission failure. A guaranteed permission postcondition expresses the permissions that will at least be held when $s$ terminates (see App. A for formal definitions).
In this section, we define inference rules to compute sufficient permission preconditions for loop-free code. For programs which do not add or remove permissions via inhale and exhale statements, the same permissions will still be held after executing the code; however, to infer guaranteed permission postconditions in the general case, we also infer the difference in permissions between the state before and after the execution. We will discuss loops in the next section. Non-recursive method calls can be handled by applying our analysis bottom-up in the call graph and using inhale and exhale statements to model the permission effect of calls. Recursion can be handled similarly to loops, but is omitted here.
We define our permission analysis to track and generate permission expressions parameterised by two distinguished variables $q_a$ and $q_i$; by parameterising our expressions in this way, we can use a single expression to represent a permission amount for each pair of $q_a$ and $q_i$ values.
3.0.1 Preconditions.
The permission precondition of a loop-free statement $s$ and a postcondition permission $p$ (in which $q_a$ and $q_i$ potentially occur) is denoted by $pre(s, p)$, and is defined in Fig. 4. Most rules are straightforward adaptations of a classical weakest-precondition computation. Array lookups require some permission to the accessed array location; we use the internal expression rd to denote a non-zero permission amount; a post-processing step can later replace rd by a concrete rational. Since downstream code may require further permission for this location, represented by the permission expression $p$, we take the maximum of both amounts. Array updates require full permission and need to take aliasing into account. The case for inhale subtracts the inhaled permission amount from the permissions required by downstream code; the case for exhale adds the permissions to be exhaled. Note that this addition may lead to a required permission amount exceeding the full permission. This indicates that the statement is not feasible, that is, all executions will lead to a permission failure.
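To make the backward computation concrete, here is a minimal Python sketch of a pre-style function for loop-free statements. It is an illustration under stated assumptions, not a transcription of Fig. 4: permission expressions are modelled as Python functions of a concrete prestate `(env, heap)` and the two parameters `qa` and `qi`, statements are plain tuples, `RD` is an arbitrary concrete stand-in for the symbolic rd, the inhale case is clamped at 0, and unwritten heap cells read as 0. All names are hypothetical.

```python
from fractions import Fraction

RD = Fraction(1, 2)  # assumption: concrete stand-in for the symbolic amount rd


def zero(env, heap, qa, qi):
    """The permission expression 0."""
    return Fraction(0)


def at(a, e, amount):
    """The term (qa = a and qi = e ? amount : 0); e is evaluated in env."""
    def p(env, heap, qa, qi):
        return Fraction(amount) if (qa == a and qi == e(env)) else Fraction(0)
    return p


def pre(stmt, post):
    """Permission required before stmt so that post is available afterwards."""
    kind = stmt[0]
    if kind == "skip":
        return post
    if kind == "assign":  # x := e, handled by substitution into post
        _, x, e = stmt
        return lambda env, heap, qa, qi: post({**env, x: e(env)}, heap, qa, qi)
    if kind == "read":  # x := a[e], needs rd for a[e]
        _, x, a, e = stmt
        need = at(a, e, RD)

        def p(env, heap, qa, qi):
            value = heap.get((a, e(env)), 0)  # assumption: default value 0
            later = post({**env, x: value}, heap, qa, qi)
            return max(later, need(env, heap, qa, qi))
        return p
    if kind == "write":  # a[e] := x, needs full permission for a[e]
        _, a, e, x = stmt
        need = at(a, e, 1)

        def p(env, heap, qa, qi):
            later = post(env, {**heap, (a, e(env)): env[x]}, qa, qi)
            return max(later, need(env, heap, qa, qi))
        return p
    if kind == "inhale":  # subtract the inhaled amount (clamped at 0)
        _, a, e, amount = stmt
        gained = at(a, e, amount)
        return lambda env, heap, qa, qi: max(
            Fraction(0), post(env, heap, qa, qi) - gained(env, heap, qa, qi))
    if kind == "exhale":  # add the exhaled amount
        _, a, e, amount = stmt
        lost = at(a, e, amount)
        return lambda env, heap, qa, qi: (
            post(env, heap, qa, qi) + lost(env, heap, qa, qi))
    if kind == "seq":  # s1; s2
        _, s1, s2 = stmt
        return pre(s1, pre(s2, post))
    if kind == "if":  # if (b) { s1 } else { s2 }
        _, b, s1, s2 = stmt
        p1, p2 = pre(s1, post), pre(s2, post)
        return lambda env, heap, qa, qi: (
            p1(env, heap, qa, qi) if b(env) else p2(env, heap, qa, qi))
    raise ValueError(f"unknown statement kind: {kind}")
```

For a hypothetical body `v := a[j]; a[j+1] := v` and postcondition `zero`, the result evaluates to `RD` at `a[j]`, to 1 at `a[j+1]`, and to 0 elsewhere. Two consecutive exhales of the full permission for the same location evaluate to 2, an amount exceeding the full permission, which matches the infeasibility remark above.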
To illustrate our pre definition, let $s$ be the body of the loop in the parCopyEven method in Fig. 2. The precondition $pre(s, 0) = (q_a = a \wedge q_i = 2j \;?\; 1/2 : 0) + (q_a = a \wedge q_i = 2j+1 \;?\; 1 : 0)$ expresses that a loop iteration requires a half permission for the even elements of array a and full permission for the odd elements.
3.0.2 Postconditions.
The final state of a method execution includes the permissions held in the method prestate, adjusted by the permissions that are inhaled or exhaled during the method execution. To perform this adjustment, we compute the difference in permissions before and after executing a statement. The relative permission difference for a loop-free statement $s$ and a permission expression $p$ (in which $q_a$ and $q_i$ potentially occur) is denoted by $\Delta(s, p)$, and is defined backward, analogously to pre in Fig. 4. The second parameter $p$ acts as an accumulator; the difference in permission is represented by evaluating $\Delta(s, 0)$.
For a statement $s$ with precondition $pre(s, 0)$, we obtain the postcondition $pre(s, 0) + \Delta(s, 0)$. Let $s$ again be the loop body from parCopyEven. Since $s$ contains exhale statements, we obtain $\Delta(s, 0) = -pre(s, 0)$. Thus, the postcondition can be simplified to $0$. This reflects the fact that all required permissions for a single loop iteration are lost by the end of its execution.
Since our $\Delta$ operator performs a backward analysis, our permission postconditions are expressed in terms of the prestate of the execution of $s$. To obtain classical postconditions, any heap accesses need to refer to the prestate heap, which can be achieved in program logics by using old expressions or logical variables. Formalizing the postcondition inference as a backward analysis simplifies our treatment of loops and has technical advantages over classical strongest postconditions, which introduce existential quantifiers for assignment statements. A limitation of our approach is that our postconditions cannot capture situations in which a statement obtains permissions to locations for which no prestate expression exists, e.g., allocation of new arrays. Our postconditions are sound; to make them precise for such cases, our inference needs to be combined with an additional forward analysis, which we leave as future work.
4 Handling Loops via Maximum Expressions
In this section, we first focus on obtaining a sufficient permission precondition for the execution of a loop in isolation (independently of the code after it) and then combine the inference for loops with the one for loop-free code described above.
4.1 Sufficient Permission Preconditions for Loops
A sufficient permission precondition for a loop guarantees the absence of permission failures for a potentially unbounded number of executions of the loop body. This concept is different from a loop invariant: we require a precondition for all executions of a particular loop, but it need not be inductive. Our technique obtains such a loop precondition by projecting a permission precondition for a single loop iteration over all possible initial states for the loop executions.
4.1.1 Exhale-Free Loop Bodies.
We consider first the simpler (but common) case of a loop that does not contain exhale statements, e.g., does not transfer permissions to a forked thread. The solution for this case is also sound for loop bodies where each exhale is followed by an inhale for the same array location and at least the same permission amount, as in the encoding of most method calls.
Consider a sufficient permission precondition $p$ for the body of a loop $\mathtt{while}\ (b)\ \{s\}$. By definition, $p$ will denote sufficient permissions to execute $s$ once; the precise locations to which $p$ requires permission depend on the initial state of the loop iteration. For example, the sufficient permission precondition for the body of the copyEven method in Fig. 2, $(j \,\%\, 2 = 0 \;?\; (q_a = a \wedge q_i = j \;?\; \mathrm{rd} : 0) : (q_a = a \wedge q_i = j \;?\; 1 : 0))$, requires permissions to different array locations, depending on the value of j. To obtain a sufficient permission precondition for the entire loop, we leverage an over-approximating loop invariant $I^{+}$ from an off-the-shelf numerical analysis (e.g., [14]) to over-approximate all possible values of the numerical variables that get assigned in the loop body, here, j. We can then express the loop precondition using the pointwise maximum $\max_{j \mid I^{+} \wedge b}(p)$, over the values of j that satisfy the condition $I^{+} \wedge b$. (The maximum over an empty range is defined to be $0$.) For the copyEven method, given the invariant $0 \le j \le \mathrm{len}(a)$, the loop precondition is $\max_{j \mid 0 \le j < \mathrm{len}(a)}(p)$.
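The pointwise maximum can be checked by brute force on small instances. The following Python sketch assumes a concrete array length `n`, a concrete stand-in `RD` for rd, and the copyEven access pattern as described in the introduction (read at even indices, write at odd indices); the function names are hypothetical. It compares the maximum over all iterations with the closed form "rd for even and 1 for odd indices within bounds".

```python
from fractions import Fraction

RD = Fraction(1, 2)  # assumption: concrete stand-in for the symbolic amount rd


def body_pre(j, qa, qi):
    """Permission that one iteration with counter j needs for location qa[qi]."""
    if qa != "a" or qi != j:
        return Fraction(0)
    return RD if j % 2 == 0 else Fraction(1)


def loop_pre(n, qa, qi):
    """Pointwise maximum of body_pre over all j with 0 <= j < n.

    The maximum over an empty range (n <= 0) is defined to be 0.
    """
    return max((body_pre(j, qa, qi) for j in range(n)), default=Fraction(0))


def closed_form(n, qa, qi):
    """Quantifier-free reading: rd at even, 1 at odd in-bounds indices of a."""
    if qa == "a" and 0 <= qi < n:
        return RD if qi % 2 == 0 else Fraction(1)
    return Fraction(0)
```

For every small `n` and every location, `loop_pre` and `closed_form` agree, which is the kind of equivalence that a maximum elimination step has to establish symbolically rather than by enumeration.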
In general, a permission precondition for a loop body may also depend on array values, e.g., if those values are used in branch conditions. To avoid the need for an expensive array value analysis, we define both an over- and an under-approximation of permission expressions, denoted $p^{\uparrow}$ and $p^{\downarrow}$ (cf. App. A.1), with the guarantees that $p \le p^{\uparrow}$ and $p^{\downarrow} \le p$. These approximations abstract away array-dependent conditions, and have an impact on precision only when array values are used to determine a location to be accessed. For example, a linear array search for a particular value accesses the array only up to the (a priori unknown) point at which the value is found, but our permission precondition conservatively requires access to the full array.
Theorem 4.1
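Let $\mathtt{while}\ (b)\ \{s\}$ be an exhale-free loop, let $\overline{x}$ be the integer variables modified by $s$, and let $I^{+}$ be a sound over-approximating numerical loop invariant (over the integer variables in $s$). Then $\max_{\overline{x} \mid I^{+} \wedge b}\big(pre(s, 0)^{\uparrow}\big)$ is a sufficient permission precondition for $\mathtt{while}\ (b)\ \{s\}$.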
4.1.2 Loops with Exhale Statements.
For loops that contain exhale statements, the approach described above does not always guarantee a sufficient permission precondition. For example, if a loop gives away full permission to the same array location in every iteration, our pointwise maximum construction yields a precondition requiring the full permission once, as opposed to the unsatisfiable precondition (since the loop is guaranteed to cause a permission failure).
As explained above, our inference is sound if each exhale statement is followed by a corresponding inhale, which can often be checked syntactically. In the following, we present another decidable condition that guarantees soundness and that can be checked efficiently by an SMT solver. If neither condition holds, we preserve soundness by inferring an unsatisfiable precondition; we did not encounter any such examples in our evaluation.
Our soundness condition checks that the maximum of the permissions required by two loop iterations is not less than the permissions required by executing the two iterations in sequence. Intuitively, that is the case when neither iteration removes permissions that are required by the other iteration.
Theorem 4.2 (Soundness Condition for Loop Preconditions)
Given a loop , let be the integer variables modified in and let and be two fresh sets of variables, one for each of . Then is a sufficient permission precondition for if the following implication is valid in all states:
The additional variables and are used to model two arbitrary valuations of ; we constrain these to represent two initial states allowed by and different from each other for at least one program variable. We then require that the effect of analysing each loop iteration independently and taking the maximum is not smaller than the effect of sequentially composing the two loop iterations.
The theorem requires implicitly that no two different iterations of a loop observe exactly the same values for all integer variables. If that could be the case, the condition would cause us to ignore a potential pair of initial states for two different loop iterations. To avoid this problem, we assume that all loops satisfy this requirement; it can easily be enforced by adding an additional variable as loop iteration counter [21].
For the parCopyEven method (Fig. 2), the soundness condition holds since, due to the condition, the two terms on the right of the implication are equal for all values of . We can thus infer a sufficient precondition as .
4.2 Permission Inference for Loops
We can now extend the pre- and postcondition inference from Sec. 3 with loops. $pre(\mathtt{while}\ (b)\ \{s\}, p)$ must require permissions such that (1) the loop executes without permission failure and (2) at least the permissions described by $p$ are held when the loop terminates. While the former is provided by the loop precondition as defined in the previous subsection, the latter also depends on the permissions gained or lost during the execution of the loop. To characterise these permissions, we extend the $\Delta$ operator from Sec. 3 to handle loops.
Under the soundness condition from Thm. 4.2, we can mimic the approach from the previous subsection and use over-approximating invariants to project out the permissions lost in a single loop iteration (where $\Delta(s, 0)$ is negative) to those lost by the entire loop, using a maximum expression. This projection conservatively assumes that the permissions lost in a single iteration are lost by all iterations whose initial state is allowed by the loop invariant and loop condition. This approach is a sound over-approximation of the permissions lost.
However, for the permissions gained by a loop iteration (where $\Delta(s, 0)$ is positive), this approach would be unsound because the over-approximation includes iterations that may not actually happen and, thus, permissions that are not actually gained. For this reason, our technique handles gained permissions via an under-approximate numerical loop invariant (e.g., [35]), that is, an invariant that must be true only for states that will actually be encountered when executing the loop, and thus projects the gained permissions only over iterations that will surely happen.
This approach is reflected in the definition of our operator below via , which represents the permissions possibly lost or definitely gained over all iterations of the loop. In the former case, we have and, thus, the first summand is 0 and the computation based on the overapproximate invariant applies (note that the negated maximum of negated values is the minimum; we take the minimum over negative values). In the latter case (), the second summand is 0 and the computation based on the underapproximate invariant applies (we take the maximum over positive values).
denotes again the integer variables modified in . The role of is to carry over the permissions that are gained or lost by the code following the loop, taking into account any state changes performed by the loop. Intuitively, the maximum expressions replace the variables in with expressions that do not depend on these variables but nonetheless reflect properties of their values right after the execution of the loop. For permissions gained, these properties are based on the underapproximate loop invariant to ensure that they hold for any possible loop execution. For permissions lost, we use the overapproximate invariant. For the loop in parCopyEven we use the invariant to obtain . Since there are no statements following the loop, and therefore are 0.
Using the same term, we can now define the general case of pre for loops, combining (1) the loop precondition and (2) the permissions required by the code after the loop, adjusted by the permissions gained or lost during loop execution:
Similarly to in the rule for , the expression conservatively overapproximates the permissions required to execute the code after the loop. For method parCopyEven, we obtain a sufficient precondition that is the negation of the . Consequently, the postcondition is 0.
4.2.1 Soundness.
Our pre and $\Delta$ definitions yield a sound method for computing sufficient permission preconditions and guaranteed postconditions:
Theorem 4.3 (Soundness of Permission Inference)
For any statement $s$, if every loop in $s$ either is exhale-free or satisfies the condition of Thm. 4.2, then $pre(s, 0)$ is a sufficient permission precondition for $s$, and $pre(s, 0) + \Delta(s, 0)$ is a corresponding guaranteed permission postcondition.
Our inference expresses pre- and postconditions using a maximum operator over an unbounded set of values. However, this operator is not supported by SMT solvers. To be able to use the inferred conditions for SMT-based verification, we provide an algorithm for eliminating these operators, as we discuss next.
5 A Maximum Elimination Algorithm
We now present a new algorithm for replacing maximum expressions over an unbounded set of values (called pointwise maximum expressions in the following) with equivalent expressions containing no pointwise maximum expressions. Note that, technically our algorithm computes solutions to since some optimisations exploit the fact that the permission expressions our analysis generates always denote nonnegative values.
5.1 Background: Quantifier Elimination
Our algorithm builds upon ideas from Cooper’s classic quantifier elimination algorithm [12] which, given a formula $\exists x.\, b$ (where $b$ is a quantifier-free Presburger formula), computes an equivalent quantifier-free formula $b'$. Below, we give a brief summary of Cooper’s approach.
The problem is first reduced via boolean and arithmetic manipulations to a formula $\exists x.\, b$ in which $x$ occurs at most once per literal and with no coefficient. The key idea is then to reduce $\exists x.\, b$ to a disjunction of two cases: (1) there is a smallest value of $x$ making $b$ true, or (2) $b$ is true for arbitrarily small values of $x$.
In case (1), one computes a finite set of expressions (the in [12]) guaranteed to include the smallest value of . For each (in/dis)equality literal containing in , one collects a boundary expression which denotes a value for making the literal true, while the value would make it false. For example, for the literal one generates the expression . If there are no (non)divisibility constraints in , by definition, will include the smallest value of making true. To account for (non)divisibility constraints such as , the lowestcommonmultiple of the divisors (and ) is returned along with ; the guarantee is then that the smallest value of making true will be for some and . We use to denote the function handling this computation. Then, can be reduced to , where .
In case (2), one can observe that the (in/dis)equality literals in will flip value at finitely many values of , and so for sufficiently small values of , each (in/dis)equality literal in will have a constant value (e.g., will be true). By replacing these literals with these constant values, one obtains a new expression equal to for small enough , and which depends on only via (non)divisibility constraints. The value of will therefore actually be determined by , where is the lowestcommonmultiple of the (non)divisibility constraints. We use to denote the function handling this computation. Then, can be reduced to , where .
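The two-case idea can be tried out on a toy formula. The Python sketch below is not Cooper's algorithm in general; it hand-instantiates the two cases for one hypothetical formula $\varphi(x) \equiv y < x \wedge x \le z \wedge 3 \mid x$ and compares the result with a brute-force search. For this formula, the only literal that turns from false to true as $x$ grows is $y < x$, with boundary expression $y + 1$; the least common multiple of the divisors is 3; and for arbitrarily small $x$ the literal $y < x$ is false, so case (2) contributes nothing.

```python
def phi(x, y, z):
    """Toy formula: y < x, x <= z, and x divisible by 3."""
    return y < x and x <= z and x % 3 == 0


def exists_bruteforce(y, z, lo=-60, hi=60):
    """Reference answer by enumeration; adequate for |y|, |z| well below 60."""
    return any(phi(x, y, z) for x in range(lo, hi + 1))


def exists_by_boundaries(y, z):
    """Finite disjunction over boundary expression plus offset.

    Case (1): if a smallest solution exists, it has the form b + d with
    b in {y + 1} and 0 <= d < delta, where delta = 3.
    Case (2): for sufficiently small x the literal y < x is false, so the
    'arbitrarily small' disjunct is false and is omitted here.
    """
    boundaries = [y + 1]
    delta = 3
    return any(phi(b + d, y, z) for b in boundaries for d in range(delta))
```

The finite check inspects only three candidate values of `x` regardless of how far apart `y` and `z` are, which is what makes the elimination independent of the (unbounded) range of `x`.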
In principle, the maximum of a function can be defined using two first-order quantifiers (one existential and one universal). One might therefore be tempted to tackle our maximum elimination problem using quantifier elimination directly. We explored this possibility and found two serious drawbacks. First, the resulting formula does not yield a permission-typed expression that we can plug back into our analysis. Second, the resulting formulas are extremely large (e.g., for the copyEven example it yields several pages of specifications), and hard to simplify since relevant information is often spread across many terms due to the two separate quantifiers. Our maximum elimination algorithm addresses these drawbacks by natively working with arithmetic expressions, while mimicking the basic ideas of Cooper’s algorithm and incorporating domain-specific optimisations.
5.2 Maximum Elimination
The first step is to reduce the problem of eliminating general terms to those in which and come from a simpler restricted grammar. These simple permission expressions do not contain general conditional expressions , but instead only those of the form (where is a constant or rd). Furthermore, simple permission expressions only contain subtractions of the form . This is achieved in a precursory rewriting of the input expression by, for instance, distributing pointwise maxima over conditional expressions and binary maxima. For example, the pointwise maximum term (part of the copyEven example):
will be reduced to:
5.2.1 Arbitrarily-small Values.
We exploit a highlevel casesplit in our algorithm design analogous to Cooper’s: given a pointwise maximum expression , either a smallest value of exists such that has its maximal value (and is true), or there are arbitrarily small values of defining this maximal value. To handle the arbitrarily small case, we define a completely analogous function, which recursively replaces all boolean expressions in with as computed by Cooper; we relegate the definition to Sec. 0.B.3. We then use , where and , as our expression in this case. Note that this expression still depends on if it contains (non)divisibility constraints; Thm. 5.1 shows how can be eliminated using and .
5.2.2 Selecting Boundary Expressions for Maximum Elimination.
Next, we consider the case of selecting an appropriate set of boundary expressions, given a term. We define this first for in isolation, and then give an extended definition accounting for the . Just as for Cooper’s algorithm, the boundary expressions must be a set guaranteed to include the smallest value of defining the maximum value in question. The set must be finite, and be as small as possible for efficiency of our overall algorithm. We refine the notion of boundary expression, and compute a set of pairs of integer expression and its filter condition : the filter condition represents an additional condition under which must be included as a boundary expression. In particular, in contexts where is false, can be ignored; this gives us a way to symbolically define an ultimatelysmaller set of boundary expressions, particularly in the absence of contextual information which might later show to be false. We call these pairs filtered boundary expressions.
Definition 1 (Filtered Boundary Expressions)
The filtered boundary expression computation for in , written , returns a pair of a set of pairs , and an integer constant , as defined in Fig. 5. This definition is also overloaded with a definition of filtered boundary expression computation for in , written .
Just as for Cooper’s computation, our function computes the set of pairs along with a single integer constant , which is the least common multiple of the divisors occurring in ; the desired smallest value of may actually be some where . There are three key points to Def. 1 which ultimately make our algorithm efficient:
First, the case for only includes boundary expressions for making true. The case of being false (from the structure of the permission expression) is not relevant for trying to maximise the permission expression’s value (note that this case will never apply under a subtraction operator, due to our simplified grammar, and the case for subtraction not recursing into the righthand operand).
Second, the case for dually only considers boundary expressions for making false (along with the boundary expressions for maximising ). The filter condition is used to drop the boundary expressions for making false; in case is not strictly positive we know that the evaluation of the whole permission expression will not yield a strictlypositive value, and hence is not an interesting boundary value for a nonnegative maximum.
Third, in the overloaded definition of , we combine boundary expressions for with those for . The boundary expressions for are, however, superfluous if, in analysing we have already determined a value for which maximises and happens to satisfy . If all boundary expressions for (whose filter conditions are true) make true, and all nontrivial (i.e. strictly positive) evaluations of used for potentially defining ’s maximum value also satisfy , then we can safely discard the boundary expressions for .
We are now ready to reduce pointwise maximum expressions to equivalent maximum expressions over finitely many cases:
Theorem 5.1 (Simple Maximum Expression Elimination)
For any pair , if , then we have:
where , and .
To see how our filter conditions help to keep the set (and therefore, the first iterated maximum on the right of the equality in the above theorem) small, consider the example: (so is , while is ). In this case, evaluating yields the set with the meaning that the boundary expression is considered in all cases, while the boundary expression is only of interest if . The first iterated maximum term would be . We observe that the term corresponding to the boundary value can be simplified to since it contains the two contradictory conditions and . Thus, the entire maximum can be simplified to . Without the filter conditions the result would instead be . In the context of our permission analysis, the filter conditions allow us to avoid generating boundary expressions corresponding e.g. to the integer loop invariants, provided that the expressions generated by analysing the permission expression in question already suffice. We employ aggressive syntactic simplification of the resulting expressions, in order to exploit these filter conditions to produce succinct final answers.
6 Implementation and Experimental Evaluation
Program  LOC  Loops  Size  Prec.  Time 

addLast  12  1 (1)  1.9  ✓  21 
append  13  1 (1)  1.9  ✓  32 
array1  17  2 (2)  0.9  ✗  28 
array2  23  3 (2)  0.9  ✗  35 
array3  23  2 (2)  1.1  ✓  24 
arrayRev  18  1 (1)  3.2  ✓*  28 
bubbleSort  23  2 (2)  1.8  ✓*  34 
copy  16  2 (1)  1.6  ✓  27 
copyEven  17  1 (1)  1.6  ✓  27 
copyEven2  14  1 (1)  1.4  ✗  20 
copyEven3  14  1 (1)  2.2  ✓*  23 
copyOdd  21  2 (1)  2.4  ✓  55 
copyOddBug  19  2 (1)  7.1  ✓  57 
copyPart  17  2 (1)  1.7  ✓  30 
countDown  21  3 (2)  1.1  ✓  32 
diff  31  2 (2)  2.0  ✗  70 
find  19  1 (1)  3.0  ✓  43 
findNonNull  19  1 (1)  3.0  ✓  40 
init  18  2 (1)  1.1  ✓  28 
init2d  23  2 (2)  2.1  ✓  52 
initEven  18  2 (1)  0.9  ✗  26 
initEvenbug  18  2 (1)  1.5  ✗  28 
initNonCnst  18  2 (1)  1.1  ✓  27 
initPart  19  2 (1)  1.1  ✓  30 
initPartBug  19  2 (1)  1.5  ✓  31 
insertSort  21  2 (2)  2.5  ✓*  35 
javaBubble  24  2 (2)  2.3  ✓*  32 
knapsack  21  2 (2)  1.3  ✗  45 
lis  37  4 (2)  4.2  ✓  73 
matrixmult  33  3 (3)  1.5  ✓  78 
mergeinter  23  2 (1)  3.4  ✗  56 
mergeintbug  23  2 (1)  2.6  ✗  59 
memcopy  16  2 (1)  1.6  ✓  28 
multarray  26  2 (2)  2.1  ✓  40 
parcopy  20  2 (1)  1.2  ✓  30 
pararray  20  2 (1)  1.2  ✓  31 
parCopyEven  22  2 (1)  5.0  ✓*  79 
parMatrix  35  4 (2)  1.1  ✓  80 
parNested  31  4 (2)  0.5  ✗  57 
relax  33  1 (1)  1.4  ✓*  55 
reverse  21  2 (1)  3.9  ✓  42 
reverseBug  21  2 (1)  1.7  ✓  42 
sanfoundry  27  2 (1)  2.1  ✓  37 
selectSort  26  2 (2)  1.0  ✗  38 
strCopy  16  2 (1)  0.9  ✗  21 
strLen  10  1 (1)  0.8  ✗  15 
swap  15  1 (1)  1.5  ✓  19 
swapBug  15  1 (1)  1.5  ✓  19 
We have developed a prototype implementation of our permission inference. The tool is written in Scala and accepts programs written in the Viper language [38], which provides all the features needed for our purposes.
Given a Viper program, the tool first performs a forward numerical analysis to infer the over-approximate loop invariants needed for our handling of loops. The implementation is parametric in the numerical abstract domain used for the analysis; we currently support the abstract domains provided by the Apron library [24]. As we have yet to integrate the implementation of under-approximate invariants (e.g., [35]), we rely on user-provided invariants, or assume them to be false if none are provided. In a second step, our tool performs the inference and maximum elimination. Finally, it annotates the input program with the inferred specification.
We evaluated our implementation on 48 programs taken from various sources; these comprise all programs from the array memory safety category of SV-COMP 2017 that do not contain strings, all programs from Dillig et al. [15] (except three examples involving arrays of arrays), loop parallelisation examples from VerCors [5], and a few programs that we crafted ourselves. We manually checked that our soundness condition holds for all considered programs. The parallel loop examples were encoded as two consecutive loops where the first one models the forking of one thread per loop iteration (by iteratively exhaling the permissions required for all loop iterations), and the second one models the joining of all these threads (by inhaling the permissions that are left after each loop iteration). For the numerical analysis we used the polyhedra abstract domain provided by Apron. The experiments were performed on a dual-core machine with a 2.60 GHz Intel Core i7-6600U CPU, running Ubuntu 16.04.
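The two-loop encoding of a parallel loop can be illustrated with a small permission-accounting sketch. It is not the Viper encoding used in the evaluation: `PermissionMap`, `exhale`, `inhale`, and `encode_parallel_loop` are hypothetical names, and the parallel loop is assumed to write `a[i]` in iteration `i`.

```python
from fractions import Fraction


class PermissionMap:
    """Tracks the fractional permission held for each array index."""

    def __init__(self, length, initial=Fraction(1)):
        self.perm = {i: Fraction(initial) for i in range(length)}

    def exhale(self, index, amount):
        # Giving away more permission than is currently held is an error.
        if self.perm[index] < amount:
            raise ValueError(f"insufficient permission for index {index}")
        self.perm[index] -= amount

    def inhale(self, index, amount):
        self.perm[index] += amount


def encode_parallel_loop(perms, n, required, returned):
    """Model a parallel loop over 0..n-1 as a fork loop followed by a join loop.

    required(i) and returned(i) map iteration i to {index: fraction}: the
    permissions the iteration needs and the permissions it hands back.
    """
    for i in range(n):  # fork: each thread takes what it needs
        for index, amount in required(i).items():
            perms.exhale(index, amount)
    for i in range(n):  # join: each thread hands back what is left
        for index, amount in returned(i).items():
            perms.inhale(index, amount)


# Assumed example: iteration i writes a[i], so it needs (and returns)
# the full permission to a[i].
perms = PermissionMap(4)
write_own_cell = lambda i: {i: Fraction(1)}
encode_parallel_loop(perms, 4, write_own_cell, write_own_cell)
```

If two iterations required the full permission to the same element, the second `exhale` in the fork loop would fail; in this model, that failure corresponds to a potential data race between the two threads.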
An overview of the results is given in Table 1. For each program, we compared the size and precision of the inferred specification with respect to hand-written ones. The running times were measured by first running the analysis 50 times to warm up the JVM and then computing the average time needed over the next 100 runs. The results show that the inference is very efficient. The inferred specifications are concise for the vast majority of the examples. In 35 out of 48 cases, our tool inferred precise specifications. Most of the imprecisions are due to the inferred numerical loop invariants; in all of these cases, manually strengthening the invariants yields a precise specification. In one example, the source of imprecision is our abstraction of array-dependent conditions (see Sec. 4).
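The measurement protocol (discarded warm-up runs followed by an average over measured runs) can be sketched as follows. This is a Python illustration of the protocol only; the tool itself runs on the JVM, and `measure_average_time`, its reporting unit, and the stand-in workload are assumptions made for this example.

```python
import time


def measure_average_time(run, warmup_runs=50, measured_runs=100):
    """Return the mean wall-clock time of `run` in milliseconds.

    The first `warmup_runs` executions are discarded (on the JVM they give
    the JIT compiler time to optimise); only the next `measured_runs` count.
    """
    for _ in range(warmup_runs):
        run()
    total = 0.0
    for _ in range(measured_runs):
        start = time.perf_counter()
        run()
        total += time.perf_counter() - start
    return 1000.0 * total / measured_runs


# Stand-in workload; in the evaluation this would be one run of the inference.
average_ms = measure_average_time(lambda: sum(range(1000)))
```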
7 Related Work
Much work is dedicated to the analysis of array programs, but most of it focuses on array content, whereas we infer permission specifications. The simplest approach consists of “smashing” all array elements into a single memory location [4]. This is generally quite imprecise, as only weak updates can be performed on the smashed array. A simple alternative is to consider array elements as distinct variables [4], which is feasible only when the length of the array is statically known. More advanced approaches perform syntax-based [18, 22, 25] or semantics-based [13, 34] partitions of an array into symbolic segments. These require segments to be contiguous (with the exception of [34]), and do not easily generalise to multidimensional arrays, unlike our approach. Gulwani et al. [20] propose an approach for inferring quantified invariants for arrays by lifting quantifier-free abstract domains. Their technique requires templates for the invariants.
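The imprecision caused by smashing can be seen in a small interval-based sketch. It only illustrates the difference between weak and strong updates and is not taken from any of the cited analyses; `Interval`, `smashed_write`, and `elementwise_write` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: int
    hi: int

    def join(self, other):
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))


def smashed_write(summary, value):
    """Weak update: the summary stands for all elements, so old values survive."""
    return summary.join(value)


def elementwise_write(cells, index, value):
    """Strong update: the written element is known exactly and is overwritten."""
    updated = list(cells)
    updated[index] = value
    return updated


zero, five = Interval(0, 0), Interval(5, 5)

# Assumed program: a has 3 elements, all initially 0; then a[0] := 5.
smashed = smashed_write(zero, five)
precise = elementwise_write([zero, zero, zero], 0, five)
```

After the write, the smashed abstraction only knows that every element lies in [0, 5], whereas the element-wise abstraction still records that `a[0]` is exactly 5 and the other elements are exactly 0.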
Dillig et al. [15] avoid an explicit array partitioning by maintaining constraints that over- and under-approximate the array elements being updated by a program statement. Their work employs a technique for directly generalising the analysis of a single loop iteration (based on quantifier elimination), which works well when different loop iterations write to disjoint array locations. Gedell and Hähnle [17] provide an analysis which uses a similar criterion to determine that it is safe to parallelise a loop, and treat its heap updates as one bulk effect. The condition for our projection over loop iterations is weaker, since it allows the same array location to be updated in multiple loop iterations (as, for example, in sorting algorithms). Blom et al. [5] provide a specification technique for a variety of parallel loop constructs; our work can infer the specifications which their technique requires to be provided.
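The disjoint-writes criterion can be made concrete by listing, for a fixed array length, the indices that each loop iteration may write. The sketch below is illustrative only: `iterations_write_disjointly` is a hypothetical helper, and the write sets are spelled out by hand rather than inferred.

```python
from itertools import combinations


def iterations_write_disjointly(write_sets):
    """True iff no array index is written by two different loop iterations."""
    return all(a.isdisjoint(b) for a, b in combinations(write_sets, 2))


n = 5

# Initialisation loop: iteration i writes only a[i].
init_writes = [{i} for i in range(n)]

# Outer loop of bubble sort: iteration i may swap any adjacent pair within
# a[0..n-1-i], so it may write every index in that range.
bubble_writes = [set(range(n - i)) for i in range(n - 1)]

init_disjoint = iterations_write_disjointly(init_writes)
bubble_disjoint = iterations_write_disjointly(bubble_writes)
```

Here `init_disjoint` holds while `bubble_disjoint` does not, which is the situation in which a criterion based on disjoint writes no longer applies but a weaker condition that tolerates repeated updates of the same location still can.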
Another alternative for generalising the effect of a loop iteration is to use a first-order theorem prover as proposed by Kovács and Voronkov [28]. In their work, however, they did not consider nested loops or multidimensional arrays. Other works rely on loop acceleration techniques [1, 7]. In particular, like ours, the work of Bozga et al. [7] does not synthesise loop invariants; they directly infer postconditions of loops with respect to given preconditions, while we additionally infer the preconditions. The acceleration technique proposed in [1] is used for the verification of array programs in the tool Booster [2].
Monniaux and Gonnord [36] describe an approach for the verification of array programs via a transformation to array-free Horn clauses. Chakraborty et al. [11] use heuristics to determine the array accesses performed by a loop iteration and split the verification of an array invariant accordingly. Their non-interference condition between loop iterations is similar to, but stronger than, our soundness condition (cf. Sec. 4). Neither work is concerned with specification inference.
A wide range of static/shape analyses employ tailored separation logics as abstract domain (e.g., [3, 19, 10, 29, 41]); these works handle recursively defined data structures such as linked lists and trees, but not random-access data structures such as arrays and matrices. Of these, Gulavani et al. [19] is perhaps closest to our work: they employ an integer-indexed domain for describing recursive data structures. It would be interesting to combine our work with such separation logic shape analyses. The problems of automating biabduction and entailment checking for array-based separation logics have been recently studied by Brotherston et al. [8] and Kimura et al. [27], but have not yet been extended to handle loop-based or recursive programs.
8 Conclusion and Future Work
We presented a precise and efficient permission inference for array programs. Although our inferred specifications contain redundancies in some cases, they are human-readable. Our approach integrates well with permission-based inference for other data structures and with permission-based program verification.
As future work, we plan to use SMT solving to further simplify our inferred specifications, to support arrays of arrays, and to extend our work to an interprocedural analysis and explore its combination with biabduction techniques.
Acknowledgements.
We thank Seraiah Walter for his earlier work on this topic, and Malte Schwerhoff and the anonymous reviewers for their comments and suggestions. This work was supported by the Swiss National Science Foundation.
References
 [1] F. Alberti, S. Ghilardi, and N. Sharygina. Definability of Accelerated Relations in a Theory of Arrays and Its Applications. In FroCoS, pages 23–39, 2013.
 [2] F. Alberti, S. Ghilardi, and N. Sharygina. Booster: An Acceleration-Based Verification Framework for Array Programs. In ATVA, pages 18–23, 2014.
 [3] J. Berdine, C. Calcagno, and P. W. O’Hearn. Smallfoot: Modular Automatic Assertion Checking with Separation Logic. In FMCO, 2005.
 [4] J. Bertrane, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, and X. Rival. Static Analysis and Verification of Aerospace Software by Abstract Interpretation. In AIAA, 2010.
 [5] S. Blom, S. Darabi, and M. Huisman. Verification of Loop Parallelisations. In FASE, pages 202–217, 2015.
 [6] J. Boyland. Checking Interference with Fractional Permissions. In SAS 2003, volume 2694 of LNCS, pages 55–72, 2003.
 [7] M. Bozga, P. Habermehl, R. Iosif, F. Konecný, and T. Vojnar. Automatic Verification of Integer Array Programs. In CAV, pages 157–172, 2009.
 [8] J. Brotherston, N. Gorogiannis, and M. Kanovich. Biabduction (and Related Problems) in Array Separation Logic. In Proceedings of CADE-26, volume 10395 of LNAI, pages 472–490. Springer, 2017.
 [9] C. Calcagno, D. Distefano, P. W. O’Hearn, and H. Yang. Compositional Shape Analysis by Means of Bi-Abduction. Journal of the ACM, 58(6):26:1–26:66, 2011.
 [10] C. Calcagno, D. Distefano, P. W. O’Hearn, and H. Yang. Compositional Shape Analysis by Means of Bi-Abduction. Journal of the ACM, 58(6):26:1–26:66, 2011.
 [11] S. Chakraborty, A. Gupta, and D. Unadkat. Verifying Array Manipulating Programs by Tiling. In SAS, pages 428–449, 2017.
 [12] D. C. Cooper. Theorem Proving in Arithmetic Without Multiplication. Machine Intelligence, 7:91–99, 1972.
 [13] P. Cousot, R. Cousot, and F. Logozzo. A Parametric Segmentation Functor for Fully Automatic and Scalable Array Content Analysis. In POPL, pages 105–118, 2011.
 [14] P. Cousot and N. Halbwachs. Automatic Discovery of Linear Restraints Among Variables of a Program. In POPL, pages 84–96, 1978.
 [15] I. Dillig, T. Dillig, and A. Aiken. Fluid Updates: Beyond Strong vs. Weak Updates. In ESOP, pages 246–266, 2010.
 [16] J. Ferrante, K. J. Ottenstein, and J. D. Warren. The program dependence graph and its use in optimization. In International Symposium on Programming, pages 125–132, 1984.
 [17] T. Gedell and R. Hähnle. Automating Verification of Loops by Parallelization. In LPAR, pages 332–346, 2006.
 [18] D. Gopan, T. W. Reps, and S. Sagiv. A Framework for Numeric Analysis of Array Operations. In POPL, pages 338–350, 2005.
 [19] B. S. Gulavani, S. Chakraborty, G. Ramalingam, and A. V. Nori. Bottom-Up Shape Analysis. In SAS, pages 188–204, 2009.
 [20] S. Gulwani, B. McCloskey, and A. Tiwari. Lifting Abstract Interpreters to Quantified Logical Domains. In POPL, pages 235–246, 2008.
 [21] A. Gupta and A. Rybalchenko. InvGen: An Efficient Invariant Generator. In CAV, pages 634–640, 2009.
 [22] N. Halbwachs and M. Péron. Discovering Properties About Arrays in Simple Programs. In PLDI, pages 339–348, 2008.
 [23] B. Jacobs, J. Smans, P. Philippaerts, F. Vogels, W. Penninckx, and F. Piessens. VeriFast: A Powerful, Sound, Predictable, Fast Verifier for C and Java. In NASA Formal Methods, pages 41–55, 2011.
 [24] B. Jeannet and A. Miné. Apron: A Library of Numerical Abstract Domains for Static Analysis. In CAV, pages 661–667, 2009.
 [25] R. Jhala and K. L. McMillan. Array Abstractions from Proofs. In CAV, pages 193–206, 2007.
 [26] N. P. Johnson, J. Fix, S. R. Beard, T. Oh, T. B. Jablin, and D. I. August. A collaborative dependence analysis framework. In CGO, pages 148–159, 2017.
 [27] D. Kimura and M. Tatsuta. Decision Procedure for Entailment of Symbolic Heaps with Arrays. In APLAS, pages 169–189, 2017.
 [28] L. Kovács and A. Voronkov. Finding Loop Invariants for Programs over Arrays Using a Theorem Prover. In FASE, pages 470–485, 2009.
 [29] Q. L. Le, C. Gherghina, S. Qin, and W.-N. Chin. Shape Analysis via Second-Order Bi-Abduction. In CAV, pages 52–68, 2014.
 [30] K. R. M. Leino. Dafny: An automatic program verifier for functional correctness. In LPAR, pages 348–370, 2010.
 [31] K. R. M. Leino and P. Müller. A Basis for Verifying Multi-Threaded Programs. In ESOP, volume 5502 of LNCS, pages 378–393, 2009.
 [32] K. R. M. Leino, P. Müller, and J. Smans. Deadlock-Free Channels and Locks. In ESOP, volume 6012 of LNCS, pages 407–426, 2010.
 [33] S. Lerner, D. Grove, and C. Chambers. Composing dataflow analyses and transformations. In POPL, pages 270–282, 2002.
 [34] J. Liu and X. Rival. An Array Content Static Analysis Based on Non-Contiguous Partitions. Computer Languages, Systems & Structures, 47:104–129, 2017.
 [35] A. Miné. Inferring Sufficient Conditions with Backward Polyhedral Under-Approximations. Electronic Notes in Theoretical Computer Science, 287:89–100, 2012.
 [36] D. Monniaux and L. Gonnord. Cell Morphing: From Array Programs to Array-Free Horn Clauses. In SAS, pages 361–382, 2016.
 [37] P. Müller, M. Schwerhoff, and A. J. Summers. Automatic Verification of Iterated Separating Conjunctions Using Symbolic Execution. In S. Chaudhuri and A. Farzan, editors, Computer Aided Verification (CAV), volume 9779 of LNCS, pages 405–425. Springer-Verlag, 2016.
 [38] P. Müller, M. Schwerhoff, and A. J. Summers. Viper: A Verification Infrastructure for Permission-Based Reasoning. In B. Jobstmann and K. R. M. Leino, editors, VMCAI, volume 9583 of LNCS, pages 41–62. Springer-Verlag, 2016.
 [39] R. Piskac, T. Wies, and D. Zufferey. Grasshopper – complete heap verification with mixed specifications. In E. Ábrahám and K. Havelund, editors, Tools and Algorithms for the Construction and Analysis of Systems, pages 124–139, Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.
 [40] J. Reynolds. Separation logic: A logic for shared mutable data structures. In Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science, LICS ’02, pages 55–74, Washington, DC, USA, 2002. IEEE Computer Society.
 [41] R. N. S. Rowe and J. Brotherston. Automatic cyclic termination proofs for recursive procedures in separation logic. In Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs, CPP 2017, pages 53–65, New York, NY, USA, 2017. ACM.
 [42] A. Salcianu and M. C. Rinard. Purity and Side Effect Analysis for Java Programs. In VMCAI, pages 199–215, 2005.
 [43] J. Smans, B. Jacobs, and F. Piessens. Implicit Dynamic Frames: Combining Dynamic Frames and Separation Logic. In S. Drossopoulou, editor, ECOOP 2009 – Object-Oriented Programming, pages 148–172, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.
 [44] A. J. Summers and P. Müller. Automating Deductive Verification for Weak-Memory Programs. In TACAS, 2018.
 [45] J. W. Voung, R. Jhala, and S. Lerner. RELAY: Static Race Detection on Millions of Lines of Code. In European Software Engineering Conference and Foundations of Software Engineering (ESEC-FSE), pages 205–214. ACM, 2007.
Appendix A Auxiliary Inference Definitions
Definition 2 (Sufficient Permission Preconditions)
A permission expression denotes a sufficient permission precondition for a statement iff, for all states satisfying , we have: .
Here, may mention the designated variables and to denote the memory location of array at index , and denotes the evaluation of in the given heap and environment.
Definition 3 (Guaranteed Permission Postconditions)
If is a sufficient permission precondition for a statement then a permission expression is a corresponding guaranteed permission postcondition for w.r.t. iff the following condition holds: For all initial states satisfying , and for all final states such that , we have: .
Note that guaranteed permission postconditions are expressed in terms of pre-states and are, thus, evaluated in and (rather than and ).
A.1 Conditional Approximation
For the handling of loops we need to abstract away array lookups in order to account for the possibility that the corresponding array value is changed by the loop. Next, we describe the operators used to facilitate that. For every boolean expression , we define an over-approximation and an under-approximation such that and independently of the program state. The over-approximation of a comparison , where , is given by
The under-approximation is defined completely analogously, the only difference being that true is replaced with false. For all remaining boolean expressions the approximation is defined recursively; for instance and .
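The following sketch shows one way such approximation operators can be realised on a small expression language. It is an illustration under stated assumptions rather than the definition from the paper: the tuple encoding of expressions, the names `mentions_lookup` and `approximate`, and the treatment of negation (which flips the direction of the approximation) are all assumed here.

```python
def mentions_lookup(expr):
    """True iff an arithmetic expression contains an array lookup.

    Assumed encoding: integers, variable names (str), ("lookup", array, index),
    or (operator, left, right) for compound arithmetic expressions.
    """
    if isinstance(expr, tuple):
        return expr[0] == "lookup" or any(mentions_lookup(e) for e in expr[1:])
    return False


def approximate(cond, over):
    """Over-approximate (over=True) or under-approximate (over=False) `cond`."""
    if isinstance(cond, bool):
        return cond
    tag = cond[0]
    if tag == "cmp":
        _, _op, left, right = cond
        if mentions_lookup(left) or mentions_lookup(right):
            return over  # true when over-approximating, false otherwise
        return cond  # comparisons without array lookups are kept as they are
    if tag == "not":
        # Negation flips the direction of the approximation.
        return ("not", approximate(cond[1], not over))
    if tag in ("and", "or"):
        return (tag, approximate(cond[1], over), approximate(cond[2], over))
    raise ValueError(f"unknown boolean expression: {cond!r}")


# Assumed example condition: i < n and not (a[i] == 0)
cond = ("and",
        ("cmp", "<", "i", "n"),
        ("not", ("cmp", "==", ("lookup", "a", "i"), 0)))
over_approx = approximate(cond, over=True)
under_approx = approximate(cond, over=False)
```

For this condition, the over-approximation keeps `i < n` and replaces the negated comparison by the negation of false, while the under-approximation replaces it by the negation of true; both results are independent of the array contents.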
We extend the notion of over- and under-approximation to permission expressions . Here, we want that and hold independently of the program state. Again, these operators are defined recursively on the structure of the permission expression in a straightforward manner; two of the more complicated cases are and .
Now, assume that is the precondition or difference inferred for the code after an statement. If the permission expression mentions the array value then it might be the case that we do not have the required permissions to talk about in the state before the