Separation logic (SL) is a well-known and popular Hoare-style framework for verifying the memory safety of heap-manipulating programs. Its power stems from the use of separating conjunction in its assertion language, where $A * B$ denotes a portion of memory that can be split into two disjoint fragments satisfying $A$ and $B$ respectively. Using separating conjunction, the frame rule becomes sound, capturing the fact that any valid Hoare triple can be extended with the same separate memory in its pre- and postconditions and remain valid, which empowers the framework to scale to large programs. Indeed, separation logic now forms the basis for verification tools used in industrial practice, notably Facebook’s Infer and Microsoft’s SLAyer.
Most separation logic analyses and tools restrict the form of assertions to a simple propositional structure known as symbolic heaps. Symbolic heaps are (possibly existentially quantified) pairs of so-called “pure” and “spatial” assertions, where pure assertions mention only equalities and disequalities between variables and spatial formulas are $*$-conjoined lists of pointer formulas $x \mapsto y$ and data structure formulas typically describing segments of linked lists ($\mathsf{ls}$) or sometimes binary trees. This fragment of the logic enjoys decidability in polynomial time and is therefore highly suitable for use in large-scale analysers. However, in recent years, various authors have investigated the computational complexity of (and/or developed prototype analysers for) many other fragments employing various different assertion constructs, including user-defined inductive predicates [18, 5, 7, 1, 10], pointers with fractional permissions [22, 13], arrays [6, 19], separating implication ($\mathrel{-\!\!*}$) [9, 4], reachability predicates and arithmetic [20, 21].
It is this last feature, arithmetic, with which we are concerned in this paper. In general, assertions involving arithmetic arise naturally when analysing arithmetical programs; moreover, the use of pointer arithmetic, where pointers are treated explicitly as numerical addresses which can be manipulated arithmetically, is a standard feature of, e.g., C code. We therefore set out by asking the following question: How much pointer arithmetic can one add to separation logic and remain within polynomial time?
Unfortunately, and perhaps surprisingly, the answer turns out to be: essentially none at all.
We study the complexity of symbolic-heap separation logic with pointers, but no other data structures, when pure formulas are extended by arithmetical constraints, in two variants. The first variant encapsulates a minimal language for pointer arithmetic, allowing only conjunctions of “difference constraints” $x \leq y + k$ (where $k$ is an integer), whereas the second is more expressive, allowing arbitrary Boolean combinations of elementary formulas over arbitrary pointer-and-offset sums.
We certainly do not claim that either fragment is appropriate for practical program verification; clearly, lacking constructs for lists or other data structures, they will be insufficiently expressive for most purposes (although they might be practical e.g. for some concurrent programs that deal only with shared memory buffers of a small fixed size). The point is that any practical fragment of separation logic employing arithmetic will almost inevitably include our minimal language and thus inherit its computational lower bounds.
Our complexity results for SL pointer arithmetic are summarised in Table 1. Perhaps our most striking result is that, even for our minimal SL pointer arithmetic, where only constant pointer offsets and conjunctions are permitted, the satisfiability problem is already $\mathsf{NP}$-complete. On the other hand, the problem is still in $\mathsf{NP}$ when we extend to full pointer arithmetic. However, there is at least one material difference between the two fragments: minimal pointer arithmetic enjoys the small model property, meaning that any satisfiable symbolic heap $A$ has a model of size polynomial in the size of $A$, whereas this property fails for full pointer arithmetic.
In the case of the entailment problem, the story is somewhat similar: for quantifier-free entailments the problem becomes $\mathsf{coNP}$-complete, irrespective of whether we consider minimal or full pointer arithmetic. However, the complexity appears to increase drastically for quantified entailments, where the problem is $\Pi^P_2$-complete for minimal pointer arithmetic but $\Pi^{\mathsf{EXP}}_1$-complete for full pointer arithmetic. ($\Pi^P_2$ is the second class in the polynomial-time hierarchy and $\Pi^{\mathsf{EXP}}_1$ is the first class in the exponential-time hierarchy, which corresponds to Presburger arithmetic.)
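| |minimal pointer arithmetic|full pointer arithmetic|
|Satisfiability|$\mathsf{NP}$-complete|$\mathsf{NP}$-complete|
|Small model property|yes|no|
|Quantifier-free entailment|$\mathsf{coNP}$-complete|$\mathsf{coNP}$-complete|
|Quantified entailment|$\Pi^P_2$-complete|$\Pi^{\mathsf{EXP}}_1$-complete|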
The remainder of this paper is structured as follows. In Section 2 we define symbolic-heap separation logic with pointer arithmetic, in both “minimal” and “full” flavours. Sections 3 and 4 study the satisfiability and entailment problems, respectively, for our minimal and full versions of SL pointer arithmetic, establishing upper and lower complexity bounds for all cases. In Section 5 we establish the small model property and thereby the upper bound for the quantified entailments within minimal pointer arithmetic. Section 6 concludes.
2 Separation logic with pointer arithmetic
Here, we introduce our language of separation logic with pointer arithmetic, building on the well-known “symbolic heap” fragment over pointers.
Because we have to take into account the balance between the arithmetical part and the spatial part of the language, we consider two varieties of pointer arithmetic: a “minimal” fragment containing only the bare essentials, and a “full” fragment allowing greater expressivity. To show lower complexity bounds, we have to contend with the fact that Presburger arithmetic is already $\mathsf{NP}$-hard by itself; thus, to reveal the true memory-related nature of the problem, we restrict the pure part of our language to something so simple that it can be processed in polynomial time. This leads us to consider minimal pointer arithmetic, in which we allow only conjunctions of ‘difference constraints’ of the form $x = y + k$ and $x \leq y + k$, where $x$ and $y$ are variables and $k$ is an integer (even negation is not permitted). On the other hand, for upper complexity bounds, it stands to reason that we should aim for as much expressivity as possible while remaining within a particular complexity class. Thus we also consider full pointer arithmetic, in which arbitrary Boolean combinations of elementary formulas over arbitrary pointer sums are permitted.
Definition 1 (SL pointer arithmetic).
A symbolic heap is given by
$$\exists \vec{z}.\ \Pi : F$$
where $\vec{z}$ is a tuple of variables from an infinite set $\mathsf{Var}$, and $\Pi$ and $F$ are respectively pure and spatial formulas, defined below.
For full pointer arithmetic, we define terms , pure formulas , and spatial formulas by the following grammar:
where ranges over .
For minimal pointer arithmetic, we instead define terms , pure formulas , and spatial formulas by the following simpler grammar:
Whenever one of $\Pi$, $F$ is empty in a symbolic heap $\Pi : F$, we omit the colon.
In the case of minimal pointer arithmetic, the pure part of a symbolic heap is a conjunction of ‘difference constraints’ of the form $x = y + k$ or $x \leq y + k$, where $x$ and $y$ are variables, and $k$ is a fixed offset in $\mathbb{Z}$. The satisfiability of such formulas can be decided in polynomial time. The crucial observation is:
A ‘circular’ system of difference constraints $x_1 \leq x_2 + k_1$, $x_2 \leq x_3 + k_2$, …, $x_m \leq x_1 + k_m$ allows one to conclude that $x_1 \leq x_1 + \sum_{i=1}^{m} k_i$, which is a contradiction iff the latter sum is negative.
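The observation above is what makes the pure part of minimal pointer arithmetic tractable: a conjunction of difference constraints is contradictory exactly when its constraint graph contains a negative cycle. The following is a minimal sketch of such a check using Bellman-Ford relaxation. The triple encoding `(x, y, k)` for $x \leq y + k$ and the function names `is_satisfiable` and `equality` are assumptions made for this illustration and do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical encoding: the triple (x, y, k) stands for the constraint x <= y + k.
Constraint = Tuple[str, str, int]


def is_satisfiable(constraints: List[Constraint]) -> bool:
    """Decide a conjunction of difference constraints over the integers."""
    variables = {v for x, y, _ in constraints for v in (x, y)}
    # Starting every distance at 0 plays the role of a virtual source node
    # connected to each variable by a zero-weight edge.
    dist = {v: 0 for v in variables}
    # One round more than the number of variables: if the last round still
    # relaxes an edge, some cycle has negative total weight.
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, k in constraints:
            # The edge y -> x with weight k encodes x <= y + k.
            if dist[y] + k < dist[x]:
                dist[x] = dist[y] + k
                changed = True
        if not changed:
            return True   # dist is an integer solution
    return False          # negative cycle, hence a contradiction


def equality(x: str, y: str, k: int) -> List[Constraint]:
    """x = y + k is the conjunction of x <= y + k and y <= x - k."""
    return [(x, y, k), (y, x, -k)]
```

For instance, the circular system `[("x", "y", 1), ("y", "x", -2)]` has offsets summing to $-1$ and is rejected, whereas `equality("x", "y", 3)` is accepted.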
Thus, by considering symbolic heaps in minimal pointer arithmetic, we shift the challenge of establishing relevant lower bounds onto the spatial part of the language.
Semantics. As usual, we interpret symbolic heaps in a stack-and-heap model; for convenience we consider locations to be natural numbers, and values to be either natural numbers or the non-addressable null value $nil$. Thus a stack is a function $s$ from variables to values. We extend stacks over terms in the usual way. If $s$ is a stack, $z$ is a variable and $v$ is a value, we write $s[z \mapsto v]$ for the stack defined as $s$ except that $s[z \mapsto v](z) = v$. We extend stacks pointwise over term tuples.
A heap is a finite partial function $h$ mapping finitely many locations to values; we write $\mathrm{dom}(h)$ for the domain of $h$, and $e$ for the empty heap that is undefined on all locations. We write $\circ$ for composition of domain-disjoint heaps: if $h_1$ and $h_2$ are heaps, then $h_1 \circ h_2$ is the union of $h_1$ and $h_2$ when $\mathrm{dom}(h_1)$ and $\mathrm{dom}(h_2)$ are disjoint, and undefined otherwise.
The satisfaction relation $s, h \models A$, where $s$ is a stack, $h$ a heap and $A$ a symbolic heap, is defined by structural induction on $A$.
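To make the heap model concrete, here is a small sketch of heap composition and of checking a $*$-conjunction of points-to cells against a heap. It assumes the usual exact reading of points-to in symbolic heaps (each cell describes a one-location heap, and the cells together account for the whole heap); the representation of heaps as Python dictionaries and the names `compose` and `satisfies_cells` are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Assumed representation: a heap is a finite map from locations to values.
Heap = Dict[int, int]


def compose(h1: Heap, h2: Heap) -> Optional[Heap]:
    """Union of two heaps if their domains are disjoint, otherwise undefined (None)."""
    if h1.keys() & h2.keys():
        return None
    return {**h1, **h2}


def satisfies_cells(cells: List[Tuple[int, int]], heap: Heap) -> bool:
    """Check heap |= a1 |-> v1 * ... * an |-> vn.

    The addresses and values in `cells` are assumed to be already evaluated
    under the stack, so only the spatial reading remains to be checked.
    """
    combined: Optional[Heap] = {}
    for address, value in cells:
        combined = compose(combined, {address: value})
        if combined is None:
            return False  # two cells claim the same location
    return combined == heap
```

Two cells with the same address can never be composed, which is why a spatial formula implicitly forces all its allocated addresses to be pairwise distinct.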
Here we establish upper and lower complexity bounds for the satisfiability problem in both the minimal and full variants of our pointer arithmetic.
Let be a symbolic heap of the form
We describe the heap models of by means of the following Presburger formula obtained by enriching the pure part with the constraints on that , the allocated addresses, must be distinct (here ,.., is the list of all variables):
The above can be easily rewritten as a Boolean combination of elementary formulas of the form , where the ‘offset’ is a variable or an integer.
Any model for can be transformed into a model for , and vice versa.
By definition, given an , a model for , we have is true, and is the disjoint collection of the corresponding cells:
which implies that .
Conversely, assume a mapping provides an evaluation which makes true. Then is true, and, in addition, we can take a heap as the disjoint collection of the cells in accordance with (3), which provides: . ∎
Satisfiability is in $\mathsf{NP}$.
Satisfiability is shown $\mathsf{NP}$-hard by reduction from the 3-colourability problem.
Problem 1 (3-colourability).
Let $G = (V, E)$ be an undirected graph with $n$ vertices $v_1, \ldots, v_n$. The 3-colourability problem is to decide if there is a 3-colouring of its vertices such that no two adjacent vertices share the same colour.
Let $G$ be an instance graph with $n$ vertices. We encode the perfect 3-colourings of $G$ with the following symbolic heap $A_G$.
We use $c_i$ to denote one of the colours, 1, 2, or 3, with which the vertex $v_i$ is marked.
To encode the fact that no two adjacent vertices and share the same colour, we use and as the addresses, relative to the base-offset , for two disjoint cells. To ensure that all cells allocated in question are disjoint, with , we introduce the numbers as:
Our choice is motivated, in particular, by needs of Definition 8 where its is guaranteed to be satisfiable whenever we allow memory chunks of length to accommodate any of distinct colours used in the trivially realizable -colouring problem.
Let pairs and be distinct. Then
Formally, we define to be the following quantifier-free symbolic heap:
Notice that $A_G$ is in minimal pointer arithmetic.
Let $G$ be an instance of the 3-colouring problem. Then $A_G$ from Definition 4 is satisfiable iff there is a perfect 3-colouring of $G$.
Any perfect 3-colouring of $G$ yields a model for $A_G$ whose stack assigns to each colour variable the colour of the corresponding vertex. The corresponding cells are all disjoint because of Proposition 2.
Conversely, given a model for $A_G$, we label each vertex by the value that the stack assigns to its colour variable, providing a perfect 3-colouring of $G$. ∎
Satisfiability is $\mathsf{NP}$-hard, even for quantifier-free symbolic heaps in minimal pointer arithmetic.
From Lemma 2. ∎
Satisfiability is $\mathsf{NP}$-complete, even for quantifier-free symbolic heaps in minimal pointer arithmetic.
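The idea behind the reduction can be illustrated by a brute-force sketch: each edge contributes two cells whose addresses are the colours of its endpoints shifted by an edge-specific base-offset, so the cells are pairwise disjoint exactly when adjacent vertices receive different colours. The concrete base-offsets used below (`4 * index`) are an assumption chosen only so that chunks belonging to different edges cannot overlap; they are not the numbers defined in the paper, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]  # vertices are numbered 1..n


def cell_addresses(colours: Dict[int, int], edges: List[Edge]) -> List[int]:
    """Addresses of the two cells allocated for each edge."""
    addresses = []
    for index, (i, j) in enumerate(edges):
        base = 4 * index  # assumed base-offset: chunk [base+1, base+3] per edge
        addresses += [colours[i] + base, colours[j] + base]
    return addresses


def find_model(n: int, edges: List[Edge]) -> Optional[Dict[int, int]]:
    """Search for colour values in {1, 2, 3} making all cells disjoint."""
    for assignment in product((1, 2, 3), repeat=n):
        colours = dict(enumerate(assignment, start=1))
        addresses = cell_addresses(colours, edges)
        if len(set(addresses)) == len(addresses):  # all cells disjoint
            return colours
    return None
```

The exhaustive search is exponential and only serves to show the correspondence; the point of the lemma is that a satisfiability checker for the symbolic heap would solve 3-colourability.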
3.1 About the small model property
As for the size of models for symbolic heaps in Corollary 1, we establish the following small model property for minimal pointer arithmetic (that is, any satisfiable formula $A$ has a model of size polynomial in the size of $A$), but not for full pointer arithmetic; cf. Remark 1.
By contrast, the small model property fails as soon as we allow elementary formulas whose offset is a variable.
Let be a symbolic heap of the form (here )
Then we have that for any model of , which implies . Thus, all models of necessarily require (the distances between) at least a half of addresses in to be of exponential size. ∎
In order to prove the small model property, we need a more workable specification of :
Any model for a symbolic heap can be determined by a Boolean vector such that and the following system, , has an integer solution:
Given a model of , we can evaluate each of the , and then calculate the appropriate by means of the equations in (7). ∎
In its turn, the system , (7), will be encoded by a constraint graph, , constructed as follows.
With each variable $x$, we associate the node labelled by $x$.
In the case of $x' \leq x + k$, we depict the arrow from the node $x$ to the node $x'$ and label it with $k$.
In the case of $\neg(x' \leq x + k)$, which means that “$x \leq x' - k - 1$”, we depict the opposite arrow from the node $x'$ to the node $x$ and label it with the number $-k-1$.
To provide the connectivity we need, we add, if necessary, a “maximum node” $x_{\max}$, with the constraint “$x \leq x_{\max}$” for all $x$. Cf. Figure 1.
Let be a symbolic heap of the form:
with its being of the form: .
Clearly, , where
In Figure 1 we show the constraint graphs for and , resp. Notice that, because of , the node is a “maximum node” in both cases.
In the case of (a), we have no solution. Namely, there is a negative cycle of the form , which provides a contradictory .
In the case of (b), the minimal weighted path from to is of the weight , which guarantees that is a model for and thereby for .
Theorem 3.2 (“the small model property”).
Let be a satisfiable symbolic heap in minimal pointer arithmetic. Then we can find a model for in which all values are bounded by , which it suffices to take as: , where ranges over all occurrences of numbers occurred in .
According to Proposition 3, there is a Boolean vector such that the corresponding system, , has a solution. Hence, the associated constraint graph, , has no negative cycles, see Definition 6 and Proposition 1.
We define our small model with the following mapping with providing an evaluation which makes true. First we define that , for the “maximum node” - so that for all . Then is defined as: , where is the minimal weighted path leading from to .
E.g., in Example 1 the small model is given by , and . ∎
In contrast to Remark 1, Theorem 3.2 remains valid even for full pointer arithmetic, provided that we confine ourselves to pointer terms of the form $x + k_0$, with $k_0$ being a fixed base-offset, while allowing any Boolean combinations of the elementary formulas $(x' = x + k_0)$, $(x' \leq x + k_0)$, and $(x' < x + k_0)$.
In addition, the corresponding polynomial-time sub-procedures are shortest-path computations with negative weights allowed (e.g. the Bellman-Ford algorithm), which run in polynomial time of low degree.
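The construction of a small model from shortest paths can be sketched as follows. A “maximum node” is connected to every variable by a zero-weight edge, and each variable receives the value of the maximum node plus the weight of the shortest path leading to it. The encoding of constraints as triples `(x, y, k)` for $x \leq y + k$, the name `small_model`, and in particular the bound used for the maximum node (the sum of the absolute offsets) are assumptions made for this sketch; the bound is a safe choice for the illustration, not necessarily the one stated in Theorem 3.2.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: the triple (x, y, k) stands for the constraint x <= y + k.
Constraint = Tuple[str, str, int]


def small_model(constraints: List[Constraint]) -> Optional[Dict[str, int]]:
    """Return a solution with all values in [0, bound], or None if unsatisfiable."""
    variables = sorted({v for x, y, _ in constraints for v in (x, y)})
    top = "<max>"  # the added maximum node
    # Edge y -> x with weight k encodes x <= y + k; top -> v encodes v <= top.
    edges = [(y, x, k) for x, y, k in constraints]
    edges += [(top, v, 0) for v in variables]
    dist = {v: math.inf for v in variables}
    dist[top] = 0
    # Bellman-Ford from the maximum node; a change in the final round
    # witnesses a negative cycle.
    for _ in range(len(variables) + 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            # Assumed bound: shortest paths are simple, so no distance
            # can drop below minus the sum of the absolute offsets.
            bound = sum(abs(k) for _, _, k in constraints)
            return {v: bound + dist[v] for v in variables}
    return None
```

For the constraints $x \leq y - 2$ and $y \leq z - 1$ this yields $z = 3$, $y = 2$, $x = 0$, so every value stays within the sum of the offsets.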
We now focus on the entailment problem: $A \vdash B$ is valid iff every model of $A$ is also a model of $B$.
Let be a symbolic heap of the form
and be a symbolic heap of the form
both and are symbolic heaps in the minimal pointer arithmetic.
We express validity of , that is, every model of is also a model of , by means of the formula :
where the following formula, , establishes an isomorphism between the disjoint collection of the cells: , and the disjoint collection of the cells: ,
Each of the above , , and can be easily rewritten as a Boolean combination of elementary formulas of the form , where the ‘offset’ is a variable or an integer (in the case of minimal pointer arithmetic, is a fixed integer).
Thus our can be rewritten as:
where is a Boolean combination of elementary formulas of the form .
Any model , which is a counter-model for , can be transformed into a model for , and vice versa.
Similar to Lemma 1. ∎
4.1 Upper and Lower Bounds
Here we establish the following upper and lower bounds for the general quantified entailment problem. Namely,
For full pointer arithmetic, the entailment problem belongs to the class Presburger $\Pi^0_2$, by which we denote, with a quantifier-free $Q$, the class of formulas in Presburger arithmetic of the form $\forall \vec{x}\, \exists \vec{y}.\ Q(\vec{x}, \vec{y})$.
For minimal pointer arithmetic, the entailment problem is proved to be at least $\Pi^P_2$-hard, where $\Pi^P_2$ is the second class in the polynomial-time hierarchy.
The crucial difference between Presburger $\Pi^0_2$ and polynomial $\Pi^P_2$ is that for the latter all variables should be polynomially bounded.
The entailment problem with quantified and is in Presburger .
According to Lemma 3, is valid iff the following holds:
The latter belongs to Presburger . ∎
The lower bound is the same:
Since we have allowed arbitrary Boolean combinations of the elementary formulas , , and , we can simulate the class Presburger , providing Presburger hardness, even within the pure part of our language.
Footnote 1: According to Theorem 5.1, given and , symbolic heaps in minimal pointer arithmetic, is valid if and only if within the corresponding form (10) representing (12), all are bounded by and all by , where is defined as: , with ranging over all occurrences of these ‘offset’ numbers occurring in and . Here is a Boolean combination of the elementary formulas , , and , where the ‘offset’ is a fixed integer.
4.2 Quantified minimal arithmetic: A lower bound
To prove $\Pi^P_2$-hardness in the quantified case for minimal pointer arithmetic, we use the following constructions.
2-round 3-colourability problem.
Let $G = (V, E)$ be an undirected graph with $n$ vertices $v_1, \ldots, v_n$, and let $v_1, \ldots, v_k$ be its leaves. The problem is to decide if every 3-colouring of the leaves can be extended to a 3-colouring of the graph, such that no two adjacent vertices share the same colour.
Let $G$ be an instance graph with $n$ vertices and $k$ leaves. In addition to the colour variables in Definition 4, to each edge we associate a further variable, representing the colour “complementary” to the colours of its two endpoints.
To encode the fact that no two adjacent vertices share the same colour, we intend to use the three colour variables associated with an edge as the addresses, relative to a common base-offset, for three consecutive cells within a memory chunk of length 3, which forces the corresponding colours to form a permutation of $(1, 2, 3)$. In order to provide sufficient memory to accommodate the disjoint cells in question, we take the base-offsets as in Definition 4 so as to satisfy Proposition 2.
Formally, we define to be the following quantifier-free symbolic heap:
and to be the following quantified symbolic heap:
where the existentially quantified variables are all variables occurring in that are not mentioned explicitly in .
Notice that both and are satisfiable and in minimal pointer arithmetic.
is satisfiable because does not impose any bounds on , so that we can use, for instance, distinct colours, which suffices to produce a perfect -colouring for any with vertices.
Proposition 2 takes care of making the corresponding cells disjoint.
Let $G$ be a 2-round 3-colouring instance. The entailment $A_G \vdash B_G$ is valid iff there is a winning strategy for the perfect 3-colouring of $G$, where $A_G$ and $B_G$ are the symbolic heaps given by Definition 8.
Suppose that there is a winning strategy such that every 3-colouring of the leaves can be extended to a perfect 3-colouring of the whole $G$. We will prove that $A_G \vdash B_G$.
Let be a stack-heap pair satisfying .
The spatial part of yields a decomposition of as the disjoint collection of the cells (we recall that and ):
Take the -colouring of the leaves obtained by assigning the colours to the leaves , ,…, resp.. where . According to the winning strategy, we can assign colours, denote them by , , to the rest of vertices , …, , resp., obtaining a -colouring of the whole such that no adjacent vertices share the same colour. In addition, we mark edges by complementary to and .
We extend the stack for quantified variables in so that for all ,
and, for each , we have . The fact that no adjacent vertices and share the same colour means that
is a permutation of
and, as a result, is also a model for :
As for the opposite direction, let . Since is satisfiable, there is a model for so that, in particular, satisfies (15).
We will construct the required winning strategy in the following way. Assume a -colouring of the leaves be given by assigning colours, say , to the leaves , ,…, respectively. We modify our original to a stack by defining, for each ,
which does not change the heap , but provides
It is clear that the modified is still a model for , and, hence, a model for . Then for some stack , which is extension of to the existentially quantified variables in , we get .
For each , , which means that, for , these represent correctly the original -colouring of the leaves.
By assigning the colours to the rest of vertices , , …, resp. we obtain a -colouring of the whole .
The spatial part of the form (16) provides that , which results in that no adjacent vertices and share the same colours and , providing a perfect -colouring of . ∎
The entailment problem is $\Pi^P_2$-hard, even for quantifier-free satisfiable formulas $A$ and quantified satisfiable formulas $B$, both in minimal pointer arithmetic.
Via the 2-round 3-colourability problem, with Lemma 4. ∎
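The “for all, there exists” shape of the 2-round 3-colourability problem, which mirrors the quantifier alternation of a quantified entailment, can be seen in the following brute-force sketch. The function name `every_leaf_colouring_extends` and the graph encoding (vertices numbered from 1, edges as pairs) are assumptions for the illustration; the search is exponential and is not the reduction itself.

```python
from itertools import product
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # vertices are numbered 1..n


def every_leaf_colouring_extends(n: int, leaves: List[int], edges: List[Edge]) -> bool:
    """True iff every 3-colouring of the leaves extends to a proper 3-colouring."""
    inner = [v for v in range(1, n + 1) if v not in leaves]

    def proper(colours: Dict[int, int]) -> bool:
        return all(colours[i] != colours[j] for i, j in edges)

    # Universal round: every assignment of colours to the leaves ...
    for leaf_colours in product((1, 2, 3), repeat=len(leaves)):
        fixed = dict(zip(leaves, leaf_colours))
        # ... existential round: some assignment to the remaining vertices.
        if not any(
            proper({**fixed, **dict(zip(inner, rest))})
            for rest in product((1, 2, 3), repeat=len(inner))
        ):
            return False
    return True
```

On a path with leaves at both ends the middle vertex can always avoid two colours, so the answer is positive; on a star with three leaves the leaves may use all three colours, leaving none for the centre.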
4.3 Quantifier-free Entailment
The quantifier-free entailment problem is in $\mathsf{coNP}$.
(Cf. Remark 1) The small model property fails as soon as we allow elementary formulas whose offset is a variable.
Let and be symbolic heaps of the form (here ), both satisfiable:
is not valid, but for any polynomial , there is a number such that for all , there is no counter-model of size . ∎
Theorem 4.3 (“the small model property”).
Given and , quantifier-free symbolic heaps in minimal pointer arithmetic, suppose that is not valid. Then we can find a counter-model such that but , in which all values are bounded by , which suffices to take as: , where ranges over all occurrences of numbers occurred in and .
Follow the proof of Theorem 3.2. ∎
As for $\mathsf{coNP}$-hardness, even for minimal pointer arithmetic, we use a construction similar to Definition 4.
Taking notations from Definition 4, we introduce a satisfiable of the form:
and a satisfiable of the form:
Let be an instance of the -colouring problem. Then is not valid iff there is a perfect -colouring of .
Any perfect -colouring of yields a model for with , which implies that because of required there.
Conversely, the implication of the fact that, for some model , we have and is that is false. With the additional , provides a perfect -colouring of . ∎
The entailment problem is $\mathsf{coNP}$-hard, even for quantifier-free satisfiable formulas $A$ and $B$, both in minimal pointer arithmetic.
The entailment problem is $\mathsf{coNP}$-complete, even for quantifier-free satisfiable formulas $A$ and $B$, both in minimal pointer arithmetic.
5 Quantified entailments: The upper bound
The lower bound is given in Theorem 4.1. For the case of quantified entailments in minimal pointer arithmetic, we establish here, in Theorem 5.1, an upper bound also of $\Pi^P_2$, as well as the small model property.
In fact we prove that the upper bound is the same, $\Pi^P_2$, so that the problem is $\Pi^P_2$-complete, even for the full pointer arithmetic but with a fixed pointer offset, where we allow any Boolean combinations of the elementary formulas $(x' = x + k_0)$, $(x' \leq x + k_0)$, and $(x' < x + k_0)$, and, in addition to the points-to formulas, we allow spatial formulas for arrays whose length is bounded by $k_0$ and lists whose length is bounded by $k_0$, where $k_0$ is a fixed integer.
5.1 Entailment: A running example
With this example, we illustrate the crucial steps on the road to a smaller model.
Assuming, for simplicity, , let be of the form
and be of the form
Then in fact is a conjunction
and by Definition 6, we can also construct the corresponding constraint graph, , the labelled edges of which are given as follows: