I Introduction
One of the main challenges in automating software verification comes from handling inductive reasoning over programs containing loops. Until recently, automated reasoning in formal verification was the primary domain of satisfiability modulo theories (SMT) solvers De Moura and Bjørner (2008); Barrett et al. (2011), yielding powerful advancements for inferring and proving loop properties with linear arithmetic and limited use of quantifiers, see e.g. Karbyshev et al. (2015); Gurfinkel et al. (2018); Fedyukovich et al. (2019). Formal verification, however, also requires reasoning about unbounded data types, such as arrays, and inductively defined data types. Specifying, for example as shown in Figure 1, that every element in the array b is initialized by a nonnegative array element of a requires reasoning with quantifiers and is best expressed in many-sorted extensions of first-order logic. Yet, the recent progress in automation for quantified reasoning in first-order theorem proving has not yet been fully integrated in formal verification. In this paper we address such a use of first-order reasoning and propose trace logic, an instance of many-sorted first-order logic, to automate the partial correctness verification of program loops: we express program semantics in trace logic and use trace logic in combination with superposition-based first-order theorem proving.
Contributions
In our previous work Barthe et al. (2019), an initial version of trace logic was introduced to formalize and prove relational properties. In this paper, we go beyond Barthe et al. (2019) and turn trace logic into an efficient approach to loop (safety) verification. We propose trace logic as a unifying framework to reason about both relational and safety properties expressed in full first-order logic with theories. We bring the following contributions.
(i) We generalize the semantics of program locations by treating them as functions of execution timepoints. In essence, unlike other works Bjørner et al. (2015); Kobayashi et al. (2020); Chakraborty et al. (2020); Ish-Shalom et al. (2020), we formalize program properties at arbitrary timepoints of program locations.
(ii) Thanks to this generalization, we provide a non-recursive axiomatization of program semantics in trace logic and prove completeness of our axiomatization with respect to Hoare logic. Our semantics in trace logic supports arbitrary quantification over loop iterations (Section V).
(iii) We guide and automate inductive loop reasoning in trace logic by using generic trace lemmas capturing inductive loop invariants (Section VI). We prove soundness of each trace lemma we introduce.
(iv) We bring first-order theorem proving into the landscape of formal verification by extending recent results in superposition-based reasoning Gleiss et al. (2020); Gleiss and Suda (2020); Kovács et al. (2017) with support for trace logic properties, complementing SMT-based verification methods in the area (Section VI). As logical consequences of our trace lemmas are also loop invariants, superposition-based reasoning in trace logic makes it possible to automatically find the loop invariants that are needed for proving safety assertions of program loops.
(v) We implemented our approach in the Rapid framework and combined Rapid with new extensions of the first-order theorem prover Vampire. We successfully evaluated our work on more than 100 benchmarks taken from the SV-COMP repository Beyer (2019), mainly consisting of safety verification challenges over programs containing integers and arrays of arbitrary length (Section VII). Our experiments show that Rapid automatically proves safety of many examples that, to the best of our knowledge, cannot be handled by other methods.
II Running Example
We illustrate and motivate our work with Figure 1. This program iterates over a constant integer array a of arbitrary length and copies positive values into a new array b. We are interested in proving the safety assertion given at line 15: given that the length a.length of a is not negative, every element in b is an element from a. Expressing such a property requires alternations of quantifiers in the first-order theories of linear integer arithmetic and arrays, as formalized in line 15. We write ∀x_ℤ and ∃x_ℤ to specify that quantified variables x are of sort integer ℤ.
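As a concrete reference point, the following Python sketch models the program of Figure 1 (the function names and the use of a growable Python list for b are our own illustration; the paper's program uses a fixed-length array and an explicit counter j):

```python
def copy_positive(a):
    # Mirror Figure 1: iterate over the constant array a and copy
    # positive values into a new array b, tracking the next free slot j.
    b = []
    j = 0
    for i in range(len(a)):
        if a[i] > 0:
            b.append(a[i])  # corresponds to b[j] = a[i]
            j += 1          # corresponds to j = j + 1
    return b

# Safety assertion of line 15: every element of b is an element of a.
def assertion_holds(a, b):
    return all(any(b[p] == a[q] for q in range(len(a)))
               for p in range(len(b)))
```

Running `assertion_holds(a, copy_positive(a))` for any integer list a exercises exactly the quantifier alternation discussed above: a universal quantifier over positions of b and an existential one over positions of a.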
While the safety assertion of line 15 holds, proving correctness of Figure 1 is challenging for most state-of-the-art approaches, such as e.g. Gurfinkel et al. (2015); Karbyshev et al. (2015); Gurfinkel et al. (2018); Fedyukovich et al. (2019). The reason is that proving safety of Figure 1 needs inductive invariants with existential/alternating quantification and involves inductive reasoning over arbitrarily bounded loop iterations/timepoints. In this paper we address these challenges as follows.
(i) We extend the semantics of program locations to describe locations parameterized by timepoints, allowing us to express values of program variables at arbitrary program locations within arbitrary loop iterations. We write, for example, i(l12(it)) to denote the value of program variable i at location l12 in a loop iteration it, where the location l12 corresponds to program line 12. We reserve the constant end for specifying the last program location, that is, line 15, corresponding to a terminating program execution of Figure 1. We then write b(end, pos) to capture the value of array b at timepoint end and position pos. For simplicity, as a is a constant array, we simply write a(pos) instead of a(tp, pos).
(ii) Exploiting the semantics of program locations, we formalize the safety assertion of line 15 in trace logic as follows:
a.length ≥ 0 → ∀pos. (0 ≤ pos < j(end) → ∃pos'. (0 ≤ pos' < a.length ∧ b(end, pos) ≃ a(pos')))  (1)
(iii) We express the semantics of Figure 1 as a set of first-order formulas in trace logic, encoding values and dependencies among program variables at arbitrary loop iterations. To this end, we extend the semantics with so-called trace lemmas, to automate inductive reasoning in trace logic. One such trace lemma exploits the semantics of updates to j, allowing us to infer that every value of j between 0 and j(end), and thus each position at which the array b has been updated, is given by some loop iteration. Moreover, updates to j happen at different loop iterations, and thus a position j at which b is updated is visited uniquely throughout Figure 1.
III Preliminaries
We assume familiarity with standard first-order logic with equality and sorts. We write ≃ for equality and x_S to denote that a logical variable x has sort S. We denote by ℤ the set of integer numbers and by 𝔹 the boolean sort. The term algebra of natural numbers is denoted by ℕ, with constructors 0 and successor suc. We also consider the symbols pred and ≤ as part of the signature of ℕ, interpreted respectively as the predecessor function and the less-than-or-equal relation.
Let P be a first-order formula with one free variable of sort ℕ. We recall the standard (stepwise) induction schema for natural numbers as being
(P(0) ∧ ∀x. (P(x) → P(suc(x)))) → ∀x. P(x)  (2)
In our work, we use a variation of the induction schema (2) to reason about intervals of loop iterations. Namely, we use the following schema of bounded induction:
(P(bl) ∧ ∀x. ((bl ≤ x < br ∧ P(x)) → P(suc(x)))) → ∀x. (bl ≤ x ≤ br → P(x))
where bl and br are term algebra expressions of ℕ, called respectively the left and right bounds of the bounded induction.
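On a finite interval of iterations, the bounded induction schema can be checked directly by enumeration; the sketch below (our own illustration, with a Python predicate standing in for the formula P) verifies that whenever both premises hold, the conclusion holds on the whole interval:

```python
def bounded_induction_holds(P, bl, br):
    # Premises of the bounded induction schema: P holds at the left
    # bound bl, and P is preserved by the successor on [bl, br).
    base = P(bl)
    step = all(P(it + 1) for it in range(bl, br) if P(it))
    if base and step:
        # Conclusion of the schema: P holds on the whole interval [bl, br].
        return all(P(it) for it in range(bl, br + 1))
    return True  # the schema is vacuously valid when a premise fails
```

The function returns True for every predicate and every choice of bounds, mirroring the validity of the schema over the naturals.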
IV Programming Model
We consider programs written in an imperative while-like programming language. This section recalls terminology from Barthe et al. (2019), however adapted to our setting of safety verification. Unlike Barthe et al. (2019), we do not consider multiple program traces. In Section V, we then introduce a generalized program semantics in trace logic, extended with reachability predicates.
function  ::= func main(){ context }
statement ::= atomicStatement
           |  if( condition ){ context } else { context }
           |  while( condition ){ context }
context   ::= statement; … ; statement
Figure 2 shows the (partial) grammar of our programming model, emphasizing the use of contexts to capture lists of statements. An input program has a single main-function, with arbitrary nestings of if-then-else conditionals and while-statements. We consider mutable and constant variables, where variables are either integer-valued numeric variables or arrays of such numeric variables. We include standard side-effect-free expressions over booleans and integers.
IV-A Locations and Timepoints
A program is considered as a set of locations, with each location corresponding to positions/lines of program statements in the program. Given a program statement s, we denote by l_s its (program) location. We reserve the location l_end to denote the end of a program. For programs with loops, some program locations might be revisited multiple times. We therefore model locations corresponding to a statement s as functions of the iterations at which the respective location is visited. For simplicity, we also write l_s for the functional representation of the location of s. We thus consider locations as timepoints of a program and treat them as functions over iterations. The target sort of locations is the sort 𝕃 of timepoints. For each enclosing loop of a statement s, the function symbol l_s takes an argument of sort ℕ, corresponding to a loop iteration. Further, when s is a loop itself, we also introduce a function symbol n_s with arguments of sort ℕ and target sort ℕ; intuitively, n_s corresponds to the last loop iteration of s. We denote the set of all function symbols l_s by S_Tp, whereas the set of all function symbols n_s is written S_n.
Example 1
We refer to program statements s by their (first) line number in Figure 1. Thus, l5 encodes the timepoint corresponding to the first assignment of i in the program (line 5). Writing w for the while-statement of Figure 1, l_w(0) and l_w(n_w) denote the timepoints of the first and the last loop iteration, respectively. The timepoints l12(suc(0)) and l12(it) correspond to line 12 of the loop body in the second and the it-th loop iteration, respectively.
IV-B Expressions over Timepoints
We next introduce commonly used expressions over timepoints. For each while-statement w, we introduce a unique variable it_w of sort ℕ, denoting loop iterations of w.
Let w1, …, wk be the enclosing loops for statement s and consider an arbitrary term it of sort ℕ. We define the expressions denoting the timepoints of statement s as
tp_s := l_s(it_w1, …, it_wk)   if s is a non-while statement
tp_s(it) := l_s(it_w1, …, it_wk, it)   if s is a while-statement
lastIt_s := n_s(it_w1, …, it_wk)   if s is a while-statement
If s is a while-statement, lastIt_s thus denotes the last iteration of s. Further, consider an arbitrary subprogram p, that is, p is either a statement or a context. The timepoint start(p) (parameterized by an iteration of each enclosing loop) denotes the timepoint when the execution of p has started: for a non-while statement s, start(s) := tp_s; for a while-statement s, start(s) := tp_s(0); and for a context c, start(c) is the start of its first statement.
We also introduce the timepoint end(p), denoting the timepoint upon which a subprogram p has been completely evaluated, and define it as the start of the subprogram that is executed immediately after p.
Finally, if s is the topmost statement of the top-level context in main(), we define end(s) := l_end.
IV-C Program Variables
We express values of program variables v at various timepoints of the program execution. To this end, we model (numeric) variables v as functions v : 𝕃 → ℤ, where v(tp) gives the value of v at timepoint tp. For array variables v, we add an additional argument of sort ℤ, corresponding to the position at which the array is accessed; that is, v : 𝕃 × ℤ → ℤ. The set of such function symbols corresponding to program variables is denoted by S_V.
Our framework can be simplified for constant, non-mutable variables by omitting the timepoint argument in the functional representation of such program variables, as illustrated below.
Example 2
For Figure 1, we denote by i(l5) the value of program variable i before being assigned in line 5. As the array variable a is non-mutable (specified by const in the program), we write a(i(tp)) for the value of array a at the position corresponding to the current value of i at timepoint tp. For the mutable array b, we consider timepoints tp at which b has been updated and write b(tp, j(tp)) for the value of array b at position j at timepoint tp during the loop.
We emphasize that we consider (numeric) program variables v to be of sort ℤ, whereas loop iterations are of sort ℕ.
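The functional view of variables can be mimicked on a concrete run by recording the value of every variable at every visited timepoint. The sketch below (our own illustration; the dictionary keys "l12" and "end" are hypothetical location names mirroring the paper's l12 and end) instruments the loop of Figure 1:

```python
def record_trace(a):
    # trace maps (variable, location, iteration) to the variable's value,
    # mimicking the modelling of a program variable v as a function of
    # timepoints: v(l12(it)) becomes trace[(v, "l12", it)].
    trace = {}
    b = [None] * len(a)
    i = j = 0
    it = 0
    while i < len(a):
        trace[("i", "l12", it)] = i   # value of i at the loop-body location
        trace[("j", "l12", it)] = j
        if a[i] > 0:
            b[j] = a[i]
            j += 1
        i += 1
        it += 1
    trace[("i", "end")] = i           # values at the last location l_end
    trace[("j", "end")] = j
    return trace, b
```

Queries such as `trace[("j", "end")]` then correspond directly to trace logic terms such as j(end).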
IV-D Program Expressions
Arithmetic constants and program expressions are modeled using integer functions and predicates. Let e be an arbitrary program expression; we write ⟦e⟧(tp) to denote the value of the evaluation of e at timepoint tp.
Let v be a function denoting a program variable. Consider e, e1, e2 to be program expressions and let tp1, tp2 denote two timepoints. We define
Eq(v, tp1, tp2) := v(tp1) ≃ v(tp2)
to denote that the program variable v has the same value at tp1 and tp2; for an array variable v, we set Eq(v, tp1, tp2) := ∀pos. v(tp1, pos) ≃ v(tp2, pos). We further introduce
EqAll(tp1, tp2) := ⋀_v Eq(v, tp1, tp2),
with the conjunction ranging over all program variables v, to define that all program variables have the same values at timepoints tp1 and tp2. We also define
Update(v, e, tp1, tp2) := v(tp2) ≃ ⟦e⟧(tp1) ∧ ⋀_{v' ≠ v} Eq(v', tp1, tp2),
asserting that the numeric program variable v has been updated to the value of e while all other program variables v' remain unchanged. This definition is further extended to array updates as
Update(v, e1, e2, tp1, tp2) := v(tp2, ⟦e1⟧(tp1)) ≃ ⟦e2⟧(tp1) ∧ ∀pos. (pos ≄ ⟦e1⟧(tp1) → v(tp2, pos) ≃ v(tp1, pos)) ∧ ⋀_{v' ≠ v} Eq(v', tp1, tp2).
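Over a recorded execution, these predicates amount to simple equality checks; a minimal sketch (our own naming, over a dictionary `states` mapping timepoint names to variable valuations):

```python
def eq(states, v, tp1, tp2):
    # Eq(v, tp1, tp2): v has the same value at both timepoints.
    return states[tp1][v] == states[tp2][v]

def eq_all(states, tp1, tp2):
    # EqAll(tp1, tp2): all program variables agree at tp1 and tp2.
    return all(eq(states, v, tp1, tp2) for v in states[tp1])

def update(states, v, value, tp1, tp2):
    # Update(v, e, tp1, tp2): v takes the given value while all
    # other variables keep their values from tp1.
    return states[tp2][v] == value and all(
        eq(states, w, tp1, tp2) for w in states[tp1] if w != v)
```

For instance, an assignment i = i + 1 between two timepoints t1 and t2 satisfies `update(states, "i", states["t1"]["i"] + 1, "t1", "t2")`.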
V Axiomatic Semantics in Trace Logic
Trace logic has been introduced in Barthe et al. (2019), yet for the setting of relational verification. In this paper we generalize the formalization of Barthe et al. (2019) in three ways. First, (i) we define program semantics in a non-recursive manner, using the predicate Reach to characterize the set of reachable locations within a given program context (Section V-B). Second, and most importantly, (ii) we prove completeness of trace logic with respect to Hoare logic (Theorem 2), which could not have been achieved in the setting of Barthe et al. (2019). Finally, (iii) we introduce the use of trace logic for safety verification (Section VI).
V-A Trace Logic
Trace logic is an instance of many-sorted first-order logic with equality. The signature of trace logic contains the signatures of the theory of natural numbers (term algebra) and of integers, as well as the respective sets of timepoint, program-variable and last-iteration function symbols as defined in Section IV.
We next define the semantics of our programming language in trace logic.
V-B Reachability and its Axiomatization
We introduce a predicate Reach to capture the set of timepoints reachable in an execution and use Reach to define the axiomatic semantics of our programming language in trace logic. We define reachability as a predicate over timepoints, in contrast to defining reachability as a predicate over program configurations as in Hoder and Bjørner (2012); Bjørner et al. (2015); Fedyukovich et al. (2019); Ish-Shalom et al. (2020).
We axiomatize Reach using trace logic formulas as follows.
Definition 1 (Reach predicate)
For any statement s, let ⟦Cond_s⟧ be the expression denoting a potential branching condition in s. We define, for the branch contexts c1 and c2 of a conditional s,
Reach(start(c1)) := Reach(start(s)) ∧ ⟦Cond_s⟧(start(s)) and Reach(start(c2)) := Reach(start(s)) ∧ ¬⟦Cond_s⟧(start(s)),
and, for the body context c of a while-statement s entered at iteration it,
Reach(start(c)) := Reach(tp_s(it)) ∧ it < lastIt_s.
For any non-while statement s occurring in context c, let
Reach(start(s)) := Reach(start(c)),
and for any while-statement s occurring in context c, let
Reach(tp_s(it)) := Reach(start(c)) ∧ it ≤ lastIt_s.
Finally, let Reach(start(c0)) := true for the top-level context c0 of main().
Note that our reachability predicate allows specifying properties about intermediate timepoints (since those properties can only hold if the referred timepoints are reached) and supports reasoning about which locations are reached.
V-C Axiomatic Semantics of the Programming Language
We axiomatize the semantics of each program statement and define the semantics of a program as the conjunction of all these axioms.
Main function
Let p be an arbitrary but fixed program; we give our definitions relative to p. The semantics of p, denoted by ⟦p⟧, consists of a conjunction of one implication per statement, where each implication has the reachability of the start-timepoint of the statement as premise and the semantics of the statement as conclusion:
⟦p⟧ := ⋀_{s in p} ∀it1 … ∀itk. (Reach(start(s)) → ⟦s⟧)
where it1, …, itk are the iterations of all enclosing loops of a statement s in p, and the semantics ⟦s⟧ of program statements s is defined as follows.
Skip
Let s be a statement skip. Then
⟦s⟧ := EqAll(end(s), start(s))  (3)
Integer assignments
Let s be an assignment v = e, where v is an integervalued program variable and e is an expression. The evaluation of s is performed in one step such that, after the evaluation, the variable v has the same value as e before the evaluation. All other variables remain unchanged and thus
⟦s⟧ := Update(v, e, start(s), end(s))  (4)
Array assignments
Consider s of the form a[e$_1$] = e$_2$, with a being an array variable and e1, e2 being expressions. The assignment is evaluated in one step. After the evaluation of s, the array a contains, at the position corresponding to the value of e1 before the evaluation, the value of e2 before the evaluation. The values at all other positions of a and all other program variables remain unchanged, and hence
⟦s⟧ := Update(a, e1, e2, start(s), end(s))  (5)
Conditional if-then-else Statements
Let s be if(Cond){c$_1$} else {c$_2$}. The semantics of s states that entering the if-branch and/or entering the else-branch does not change the values of the variables, and we have
⟦Cond⟧(start(s)) → EqAll(start(c1), start(s))  (6a)
¬⟦Cond⟧(start(s)) → EqAll(start(c2), start(s))  (6b)
where the semantics of the expression Cond is according to Section IV-D.
While-Statements
Let s be the while-statement while(Cond){c}. We refer to Cond as the loop condition. The semantics of s is captured by the conjunction of the following three properties: (7a) the iteration lastIt_s is the first iteration at which Cond does not hold, (7b) entering the loop body does not change the values of the variables, (7c) the values of the variables at the end of evaluating s are the same as the variable values at the loop-condition location in iteration lastIt_s. As such, we have
∀it. (it < lastIt_s → ⟦Cond⟧(tp_s(it))) ∧ ¬⟦Cond⟧(tp_s(lastIt_s))  (7a)
∀it. (it < lastIt_s → EqAll(start(c), tp_s(it)))  (7b)
EqAll(end(s), tp_s(lastIt_s))  (7c)
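Properties (7a) and (7c) can be validated on any concrete execution. The sketch below (our own illustration) records the value of the counter at each loop-condition check of a simple loop while(i < bound){ i = i + 1 } and asserts both properties; (7b) is immediate here, as entering the body changes no variable:

```python
def run_and_check(bound):
    # cond_vals[it] is the value of i at the loop-condition timepoint
    # tp_s(it), for it = 0, 1, ..., lastIt_s.
    cond_vals = []
    i = 0
    while True:
        cond_vals.append(i)
        if not (i < bound):
            break
        i += 1
    last_it = len(cond_vals) - 1
    # (7a): last_it is the first iteration where the condition fails.
    assert all(cond_vals[it] < bound for it in range(last_it))
    assert not (cond_vals[last_it] < bound)
    # (7c): the value at end(s) equals the value at tp_s(last_it).
    assert i == cond_vals[last_it]
    return last_it, i
```

Note that `run_and_check(0)` exercises the corner case of a loop whose body never executes, where lastIt_s is the first iteration.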
V-D Soundness and Completeness
The axiomatic semantics of our programming language in trace logic is sound. That is, given a program p, any execution of p according to the standard small-step operational semantics yields a model of the axiomatic semantics ⟦p⟧. We conclude the next theorem and refer to Appendix LABEL:sec:soundness for details.
Theorem 1 (Soundness)
Let p be a program. Then the axiomatic semantics ⟦p⟧ is sound with respect to the standard small-step operational semantics.
Next, we show that the axiomatic semantics of our programming language in trace logic is complete with respect to Hoare logic Hoare (1969), as follows.
Intuitively, a Hoare triple {F1} p {F2} corresponds to the trace logic formula
F1(start(p)) → F2(end(p))  (8)
where the expressions F1(start(p)) and F2(end(p)) denote the result of adding the timepoints start(p) and end(p), respectively, as first arguments to each program variable in F1 and F2. We therefore define the axiomatic semantics to be complete with respect to Hoare logic if, for any Hoare triple valid relative to the background theory, the corresponding trace logic formula (8) is derivable from the axiomatic semantics in the background theory. With this definition at hand, we get the following result, proved formally in Appendix LABEL:sec:completeness.
Theorem 2 (Completeness with respect to Hoare logic)
The axiomatic semantics of our programming language in trace logic is complete with respect to Hoare logic.
VI Trace Logic for Safety Verification
We now introduce the use of trace logic for verifying safety properties of programs. We consider safety properties expressed in first-order logic with theories, as illustrated in line 15 of Figure 1. Thanks to soundness and completeness of the axiomatic semantics, partial correctness of a program p with regard to a safety property F can be proved using the axiomatic semantics of p in trace logic. That is, we assume termination and establish partial program correctness. Assuming the existence of an iteration violating the loop condition can help backward reasoning and, in particular, automatic splitting of loop iteration intervals.
However, proving correctness of a program p annotated with a safety property faces the reasoning challenges of the underlying logic, in our case of trace logic. Due to the presence of loops, a challenging aspect in using trace logic for safety verification is handling inductive reasoning, as induction cannot in general be expressed in first-order logic. To circumvent the challenge of inductive reasoning and automate verification using trace logic, we introduce a set of first-order lemmas, called trace lemmas, and extend the semantics of programs in trace logic with these trace lemmas. Trace lemmas describe generic inductive properties over arbitrary loop iterations, and any logical consequence of trace lemmas yields a valid program loop property as well. We next summarize our approach to program verification using trace logic and then address the challenge of inductive reasoning in trace logic.
VI-A Safety Verification in Trace Logic
Given a program p and a safety property F,
(i) we express the program semantics ⟦p⟧ in trace logic, as given in Section V;
(ii) we formalize the safety property in trace logic, that is, we express F by using program variables as functions of locations and timepoints (similarly as in (1)); for simplicity, we denote the trace logic formalization of F also by F;
(iii) we introduce instances T(p) of a set T of trace lemmas, by instantiating trace lemmas with program variables, locations and timepoints of p;
(iv) to verify F, we then show that F is a logical consequence of ⟦p⟧ and T(p);
(v) however, to conclude that p is partially correct with regard to F, two more challenges need to be addressed. First, in addition to Theorem 1, soundness of our trace lemmas T needs to be established, implying that our trace lemma instances T(p) are also sound. Soundness of ⟦p⟧ and T(p) then implies validity of F whenever F is proven to be a logical consequence of the sound formulas ⟦p⟧ and T(p). However, to ensure that F is provable in trace logic, as a second challenge we need to ensure that our trace lemmas T, and thus their instances T(p), are strong enough to prove F. That is, proving that F is a safety assertion of p in our setting requires finding a suitable set of trace lemmas.
In the remainder of this section, we address (v) and show that our trace lemmas are sound consequences of bounded induction (Section VI-B). Practical evidence for the use of our trace lemmas is further given in Section VII-B.
VI-B Trace Lemmas for Verification
Trace logic properties support arbitrary quantification over timepoints and describe values of program variables at arbitrary loop iterations and timepoints. We can therefore relate timepoints with values of program variables in trace logic, allowing us to describe the value distributions of program variables as functions of timepoints throughout program executions. As such, trace logic supports
(i) reasoning about the existence of a specific loop iteration, allowing us to split the range of loop iterations at a particular timepoint, based on the safety property we want to prove. For example, we can express and derive loop iterations corresponding to timepoints where one program variable takes a specific value for the first time during loop execution;
(ii) universal quantification over the array content and over ranges of loop iterations bounded by two arbitrary left and right bounds, allowing us to apply instances of the bounded induction schema of Section III within a range of loop iterations bounded, for example, by 0 and lastIt_s for some while-statement s.
Addressing these benefits of trace logic, we express generic patterns of inductive program properties as trace lemmas.
Identifying a suitable set of trace lemmas to automate inductive reasoning in trace logic is, however, challenging and domain-specific. We propose three trace lemmas for inductive reasoning over arrays and integers, by considering
(A1) one trace lemma describing how values of program variables change during an interval of loop iterations;
(B1-B2) two trace lemmas describing the behavior of loop counters.
We prove soundness of our trace lemmas; below we include only one proof and refer to Appendix LABEL:sec:tracelemmas for further details.
(A1) Value Evolution Trace Lemma
Let w be a while-statement, let v be a mutable program variable, and let ∘ be a reflexive and transitive relation, that is, ∘ ∈ {≃, ≤} in the setting of trace logic. The value evolution trace lemma of w, v, and ∘ is defined as
∀bl, br. ((∀it. ((bl ≤ it < br ∧ v(tp_w(bl)) ∘ v(tp_w(it))) → v(tp_w(bl)) ∘ v(tp_w(suc(it))))) → (bl ≤ br → v(tp_w(bl)) ∘ v(tp_w(br))))  (A1)
In our work, the value evolution trace lemma is mainly instantiated with the equality predicate to conclude that the value of a variable does not change during a range of loop iterations, provided that the variable value does not change at any of the considered loop iterations.
Example 4
For Figure 1, the value evolution trace lemma (A1), instantiated with ∘ as ≃ and v as b, yields a property which allows us to prove that the value of b at some position j remains the same from the timepoint this value was first set until the end of the program execution. That is, we derive b(end, pos) ≃ b(tp_w(suc(it)), pos) for the iteration it in which position pos of b was set.
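The instance used in Example 4 can be sanity-checked on a recorded execution. The sketch below (our own illustration) takes a snapshot of b at every loop iteration of Figure 1 and checks the derived fact that a value, once written to b, persists to the end timepoint:

```python
def check_value_evolution(a):
    # Record a snapshot of b at the start of every loop iteration
    # and once more at the end timepoint.
    snapshots = []
    b = [None] * len(a)
    i = j = 0
    while i < len(a):
        snapshots.append(list(b))
        if a[i] > 0:
            b[j] = a[i]
            j += 1
        i += 1
    snapshots.append(list(b))  # snapshot at the end timepoint
    # A1 with equality: if b[p] is unchanged at every step of an
    # interval, it is unchanged over the whole interval; hence any
    # value present in an earlier snapshot must survive to the end.
    for snap in snapshots:
        for p, val in enumerate(snap):
            if val is not None:
                assert snapshots[-1][p] == val
    return b
```

The check succeeds precisely because each position of b is written at most once, which is the content of trace lemma (B2) below.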
We next prove soundness of our trace lemma (A1).
Proof (Soundness Proof of Value Evolution Trace Lemma (A1)) Let bl and br be arbitrary but fixed, and assume that the premise of the outermost implication of (A1) holds. That is,
∀it. ((bl ≤ it < br ∧ v(tp_w(bl)) ∘ v(tp_w(it))) → v(tp_w(bl)) ∘ v(tp_w(suc(it))))  (9)
We use the bounded induction schema of Section III and consider its instance with P(it) := v(tp_w(bl)) ∘ v(tp_w(it)) and bounds bl and br, whose premises are
v(tp_w(bl)) ∘ v(tp_w(bl))  (10a)
∀it. ((bl ≤ it < br ∧ v(tp_w(bl)) ∘ v(tp_w(it))) → v(tp_w(bl)) ∘ v(tp_w(suc(it))))  (10b)
and whose conclusion is
∀it. (bl ≤ it ≤ br → v(tp_w(bl)) ∘ v(tp_w(it)))  (10c)
Note that the base case property (10a) holds since ∘ is reflexive. Further, the inductive case (10b) also holds, since it is exactly (9). We thus derive property (10c) and, in particular, v(tp_w(bl)) ∘ v(tp_w(br)), proving our trace lemma (A1).
(B1) Intermediate Value Trace Lemma
Let w be a while-statement and let v be a mutable program variable. We call v dense if the following holds:
Dense_{w,v} := ∀it. (it < lastIt_w → (v(tp_w(suc(it))) ≃ v(tp_w(it)) ∨ v(tp_w(suc(it))) ≃ v(tp_w(it)) + 1))
The intermediate value trace lemma of w and v is defined as
∀x. ((Dense_{w,v} ∧ v(tp_w(0)) ≤ x < v(tp_w(lastIt_w))) → ∃it. (it < lastIt_w ∧ v(tp_w(it)) ≃ x ∧ v(tp_w(suc(it))) ≃ x + 1))  (B1)
The intermediate value trace lemma (B1) allows us to conclude that if the variable v is dense, and if a value x lies between the value of v at the beginning of the loop and the value of v at the end of the loop, then there is an iteration in the loop at which v has exactly the value x and is incremented. This trace lemma is mostly used to find specific iterations corresponding to positions in an array.
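Density and the intermediate value property can be tested on concrete iteration sequences; a minimal sketch (our own illustration, over the list of values of v at successive loop-condition checks):

```python
def is_dense(values):
    # v is dense: at each step it stays equal or is incremented by 1.
    return all(b - a in (0, 1) for a, b in zip(values, values[1:]))

def intermediate_iteration(values, x):
    # B1: if v is dense and values[0] <= x < values[-1], then some
    # iteration has v == x and increments it to x + 1.
    assert is_dense(values) and values[0] <= x < values[-1]
    for it in range(len(values) - 1):
        if values[it] == x and values[it + 1] == x + 1:
            return it
    raise AssertionError("no such iteration: B1 violated")
```

For the counter j of Figure 1, `intermediate_iteration` recovers, for each position x of b, the iteration in which that position was written.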
(B2) Iteration Injectivity Trace Lemma
Let w be a while-statement and let v be a mutable program variable. We call v strongly dense if v is incremented by 1 in every loop iteration. The iteration injectivity trace lemma of w and v is
∀it, it'. ((∀it''. (it'' < lastIt_w → v(tp_w(suc(it''))) ≃ v(tp_w(it'')) + 1) ∧ it < it' ≤ lastIt_w) → v(tp_w(it)) ≄ v(tp_w(it')))  (B2)
The trace lemma (B2) states that a strongly dense variable visits each array position at most once. As a consequence, if each array position is visited only once in a loop, we know that its value has not changed after the first visit; in particular, the value at the end of the loop is the value after the first visit.
Example 6
Trace lemma (B2) is necessary in Figure 1 to apply the value evolution trace lemma (A1) to b, as we need to ensure that we never reach the same position j twice.
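Iteration injectivity can likewise be checked on a recorded run of Figure 1; the sketch below (our own illustration) records the position j at each update of b and asserts that all updated positions are pairwise distinct:

```python
def updated_positions(a):
    # Record the position j at which b is updated, per loop iteration
    # of Figure 1's loop.
    positions = []
    j = 0
    for i in range(len(a)):
        if a[i] > 0:
            positions.append(j)  # b[j] is written in this iteration
            j += 1
    # B2: j is incremented exactly when b is written, so the written
    # positions form a strictly increasing, hence injective, sequence.
    assert len(positions) == len(set(positions))
    return positions
```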
Based on the soundness of our trace lemmas, we conclude the next result.
Theorem 3 (Trace Lemmas and Induction)
Let p be a program, and let L be a trace lemma instantiated for some while-statement w of p and some variable v of p. Then L is a consequence of the bounded induction schema of Section III and of the axiomatic semantics of p in trace logic.
VII Implementation and Experiments
VII-A Implementation
We implemented our approach in the Rapid tool, written in C++ and available at https://github.com/gleiss/rapid.
Rapid takes as input a program in the while-language of Section IV together with a property expressed in trace logic, using the SMT-LIB syntax Barrett et al. (2017). Rapid outputs (i) the program semantics as in Section V, (ii) instantiations of trace lemmas for each mutable variable and each loop of the program, as discussed in Section VI-B, and (iii) the safety property, expressed in trace logic and encoded in the SMT-LIB syntax.
For establishing safety, we pass the generated reasoning task to the first-order theorem prover Vampire Kovács and Voronkov (2013) to prove the safety property from the program semantics and the instantiated trace lemmas (we also established the soundness of each trace lemma instance separately by running additional validity queries with Vampire), as discussed in Section VI-A. Vampire searches for a proof by refuting the negation of the property, based on saturation of a set of clauses with respect to a set of inference rules such as resolution and superposition.
In our experiments, we use a custom version of Vampire (https://github.com/vprover/vampire/tree/gleissrapid) with a timeout of 60 seconds, in two different configurations. On the one hand, we use a base configuration, where we tune Vampire to the trace logic domain using (i) existing options and (ii) domain-specific implementation to guide the high-level proof search. On the other hand, we use an extended configuration, which adds to the base configuration recent techniques from Gleiss et al. (2020); Gleiss and Suda (2020) improving theory reasoning in equational theories. As such, the extended configuration represents the result of a fundamental effort to improve Vampire's reasoning for software verification. In particular, theory split queues Gleiss and Suda (2020) present a partial solution to the prevalent challenge of combining quantification and lightweight theory reasoning, drastically improving first-order reasoning in applications of software verification, as shown next.
VII-B Experimental Results
We considered challenging Java and C-like verification benchmarks from the SV-COMP repository Beyer (2019), containing combinations of loops and arrays. We omitted those examples for which the task is to find bugs in the form of counterexample traces, as well as those examples that cannot be expressed in our programming model, such as examples with explicit memory management. To extend the benchmark set, we also included additional challenging programs and functional properties. As a result, we obtained benchmarks ranging over 45 unique programs with a total of 103 tested properties. Our benchmarks are available in the Rapid repository (https://github.com/gleiss/rapid/tree/master/examples/arrays).
We manually transformed those benchmarks into our input format. SV-COMP benchmarks encode properties featuring universal quantification by extending the corresponding program with an additional loop containing a standard C-like assertion. For instance, the property ∀i. (0 ≤ i < a.length → P(a[i])) would be encoded by extending the program with a loop
for(int i = 0; i < a.length; i++)
    assert(P(a[i]))
While this encoding loses explicit structure and results in a harder reasoning task, it is necessary, as other tools do not support explicit universal quantification in their input language. In contrast, our approach can handle arbitrarily quantified properties over unbounded data structures. We thus directly formulate universally quantified properties, without using any program transformations.
The results of our experiments are presented in Table 1. We divided the results into four segments, in the following order: the first eleven problems are quantifier-free, the largest part of 62 problems are universally quantified, seven problems are existentially quantified, while the last 23 problems contain quantifier alternations. First, we are interested in the overall number of problems we are able to prove correct. In the extended configuration, which represents our main configuration, Vampire is able to prove 78 out of the 103 encodings. In particular, we verify Figure 1, corresponding to benchmark copy_positive_1, as well as other challenging properties that involve quantifier alternations, such as partition_5.
Second, we compare the results of the base and the extended configuration, in order to understand the importance of the recently developed techniques from Gleiss et al. (2020) and Gleiss and Suda (2020) for reasoning in the trace logic domain. While the base configuration is only able to prove 15 out of the 103 properties, the extended configuration proves 78, that is, it improves over the base configuration by 63 examples. Moreover, only the extended configuration is able to prove advanced properties involving quantifier alternations. We therefore see that the extended configuration drastically outperforms the base one, suggesting that the recently developed techniques are essential for efficient reasoning in trace logic.
Third, we are interested in what kinds of properties Rapid can prove. It comes as no surprise that all quantifier-free instances could be proved. Out of 62 universally quantified properties, Rapid could establish correctness of 53. More interestingly, Rapid proves 14 out of the 30 benchmarks containing either existentially quantified properties or properties with quantifier alternations. The benchmarks that could not be solved by Rapid are primarily universally and alternatingly quantified properties that need additional trace lemmas relating the values of multiple program variables.
Comparing with other tools.
We compare our work against other approaches in Section VIII. Here, we omit a direct comparison of Rapid with other tools, for the following reasons:
(1) Our benchmark suite includes 62 universally quantified and 11 non-quantified properties that could technically be supported by state-of-the-art tools such as Spacer/SeaHorn and FreqHorn. Our benchmarks, however, also include 30 benchmarks with existential (7 examples) and alternating quantification (23 examples) that these tools cannot handle. As these examples depend on invariants that are alternatingly or at least existentially quantified, we believe these other tools cannot solve these benchmarks, while Rapid solves 14 examples in this domain.
(2) In our preliminary work Barthe et al. (2019), we already compared the reasoning within Rapid against Z3 and CVC4. These experiments showed that, due to our semantics' fundamentally different treatment of program variables as functions over timepoints, Rapid outperformed SMT-based reasoning approaches.
(3) Our program semantics differs from the one used in Horn clause verification techniques.
Concerning previous approaches with first-order reasoners, the benchmarks of Gleiss et al. (2018) form a subset of 55 examples of our current benchmark suite; only 21 of these could be proved by Gleiss et al. (2018). For instance, our example in Figure 1 could not be proven in Gleiss et al. (2018). We believe that our work can be combined with the approaches of Kovács and Voronkov (2009); Gleiss et al. (2018) to infer non-trivial invariants and loop bounds from saturation-based proof search. Our work can, thus, complement existing tools in proving complex quantified properties.
VIII Related Work
Our work is closely related to recent efforts in using first-order theorem provers for proving software properties Kovács and Voronkov (2009); Gleiss et al. (2018). While Gleiss et al. (2018) captures program semantics in the first-order language of extended expressions over loop iterations, in our work we further generalize the semantics of program locations and consider program expressions over loop iterations and arbitrary timepoints. Further, we introduce and prove trace lemmas to automate inductive reasoning based on bounded induction over loop iterations. Our generalizations in trace logic proved to be necessary to automate the verification of properties with arbitrary quantification, which could not be effectively achieved in Gleiss et al. (2018). Moreover, our work is not restricted to reasoning about single loops, as Gleiss et al. (2018) is.
Compared to Barthe et al. (2019), we provide a non-recursive generalization of the axiomatic semantics of programs in trace logic, prove completeness of our axiomatization in trace logic, ensure soundness of our trace lemmas, and use trace logic for safety verification.
In comparison to verification approaches based on program transformations Kobayashi et al. (2020); Chakraborty et al. (2020); Yang et al. (2019), we do not require user-provided functions to transform program states into smaller-sized states Ish-Shalom et al. (2020), nor are we restricted to universal properties generated by symbolic execution Chakraborty et al. (2020). Rather, we use only three trace lemmas, which we prove sound, to automate the verification of first-order properties, possibly with quantifier alternations.
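The bounded induction underlying our trace lemmas can be sketched as the following scheme, where P ranges over induction hypotheses over loop iterations, s denotes the successor of an iteration, and n is the final iteration of the loop; this is an illustrative sketch of the scheme, not the paper's verbatim lemma formulation:

```latex
\Big( P(0) \;\wedge\; \forall \mathit{it}.\, \big( \mathit{it} < n \wedge P(\mathit{it}) \big) \rightarrow P(s(\mathit{it})) \Big)
\;\rightarrow\; \forall \mathit{it}.\, \mathit{it} \le n \rightarrow P(\mathit{it})
\end{array}
```

Instances of this scheme are added as first-order axioms, so that loop invariants arise as logical consequences during saturation rather than being synthesized by a separate invariant-generation procedure.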
The works Dillig et al. (2010); Cousot et al. (2011) consider expressive abstract domains and limit the generation of universal invariants to these domains, while supporting potentially more generic program grammars than our language. Our work, however, can verify universal and/or existential first-order properties with theories, which is not the case in Kobayashi et al. (2020); Chakraborty et al. (2020); Dillig et al. (2010); Cousot et al. (2011). Verifying universal loop properties over arrays by implicitly finding invariants is addressed in Gurfinkel et al. (2018); Fedyukovich et al. (2019); Komuravelli et al. (2015); Fedyukovich et al. (2017); Fedyukovich and Bodík (2018); Matsushita et al. (2020), and by using constrained Horn clause reasoning within property-driven reachability analysis in Hoder and Bjørner (2012); Cimatti and Griggio (2012).
Another line of research proposes abstraction and lazy interpolation Alberti et al. (2012); Afzal et al. (2020), as well as recurrence solving with SMT-based reasoning Rajkhowa and Lin (2018). Synthesis-based approaches, such as Fedyukovich et al. (2019), have been shown to be successful at inferring universally quantified invariants and proving program correctness from these invariants. Synthesis-based term enumeration is also used in Yang et al. (2019), in combination with user-provided invariant templates. Compared to these works, we do not consider programs merely as sequences of states, but model program values as functions of loop iterations and timepoints. We synthesize bounds on loop iterations and infer first-order loop invariants as logical consequences of our trace lemmas and program semantics in trace logic.
IX Conclusion
We introduced trace logic to reason about safety properties of loops over arrays. Trace logic supports explicit timepoint reasoning, allowing arbitrary quantification over loop iterations. We use trace lemmas, as consequences of bounded induction, to automate inductive loop reasoning in trace logic. We formalize the axiomatic semantics of programs in trace logic and prove it to be both sound and complete. We report on our implementation in the Rapid framework, which allows us to use superposition-based reasoning in trace logic on challenging verification examples. Generalizing our work to termination analysis, and extending our programming language and its semantics in trace logic with more complex constructs, are interesting tasks for future work.
Acknowledgements. This work was funded by the ERC Starting Grant 2014 SYMCAR 639270, the ERC Proof of Concept Grant 2018 SYMELS 842066, the Wallenberg Academy Fellowship 2014 TheProSE, and the Austrian FWF research project W1255-N23.
References
 VeriAbs: verification by abstraction and test generation (competition contribution). In TACAS, pp. 383–387.
 Lazy abstraction with interpolants for arrays. In LPAR, pp. 46–61.
 CVC4. In CAV, pp. 171–177.
 The SMT-LIB standard: version 2.6. Technical report, Department of Computer Science, The University of Iowa. Note: available at www.SMTLIB.org.
 Verifying relational properties using trace logic. In FMCAD, pp. 170–178.
 Automatic verification of C and Java programs: SV-COMP 2019. In TACAS, pp. 133–155.
 Horn clause solvers for program verification. In Fields of Logic and Computation II, pp. 24–51.
 Verifying array manipulating programs with full-program induction. In TACAS, pp. 22–39.
 Software model checking via IC3. In CAV, pp. 277–293.
 A parametric segmentation functor for fully automatic and scalable array content analysis. In POPL, pp. 105–118.
 Z3: an efficient SMT solver. In TACAS, pp. 337–340.
 Fluid updates: beyond strong vs. weak updates. In ESOP, pp. 246–266.
 Accelerating syntax-guided invariant synthesis. In TACAS, pp. 251–269.
 Sampling invariants from frequency distributions. In FMCAD, pp. 100–107.
 Quantified invariants via syntax-guided synthesis. In CAV, pp. 259–277.
 Subsumption demodulation in first-order theorem proving. In IJCAR.
 Loop analysis by quantification over iterations. In LPAR, pp. 381–399.
 Layered clause selection for theory reasoning. In IJCAR.
 The SeaHorn verification framework. In CAV, pp. 343–361.
 Quantifiers on demand. In ATVA, pp. 248–266.
 An axiomatic basis for computer programming. Communications of the ACM 12 (10), pp. 576–580.
 Generalized property directed reachability. In SAT, pp. 157–171.
 Putting the squeeze on array programs: loop verification via inductive rank reduction. In VMCAI, pp. 112–135.
 Property-directed inference of universal invariants or proving their absence. In CAV, pp. 583–602.
 Fold/unfold transformations for fixpoint logic. In TACAS, pp. 195–214.
 Compositional verification of procedural programs using Horn clauses over integers and arrays. In FMCAD, pp. 89–96.
 Coming to terms with quantified reasoning. In POPL, pp. 260–270.
 Finding loop invariants for programs over arrays using a theorem prover. In FASE, pp. 470–485.
 First-order theorem proving and Vampire. In CAV, pp. 1–35.
 RustHorn: CHC-based verification for Rust programs. In ESOP, pp. 484–514.
 Extending VIAP to handle array programs. In VSTTE, pp. 38–49.
 Lemma synthesis for automating induction over algebraic data types. In CP, pp. 600–617.